.he 'PARRY2''Page %'
.fo ''- % -''
.mh
Conversational Language Comprehension 
Using Integrated Pattern-Matching
and Parsing


BY


Roger C. Parkison
Kenneth Mark Colby
William S. Faught

.bb
Abstract:


.hm
.pg
One of the major problems in natural language understanding by
computer is the frequent use of patterned or idiomatic phrases in
colloquial English dialogue. Traditional parsing methods
typically cannot cope with a significant number of idioms. A more
general problem is the tendency of a speaker to leave the meaning
of an utterance ambiguous or partially implicit, to be filled
in by the hearer from a shared mental context which includes linguistic,
social, and physical knowledge. The appropriate representation for this knowledge
is a formidable and unsolved problem. We present here an approach to
natural language understanding which addresses itself to these
problems. Our program uses a series of processing
stages which progressively transform an
English input into a form usable by our computer simulation of
paranoia. Most of the processing stages involve matching the input
to an appropriate stored pattern and performing the indicated
transformation. However, a few key stages perform aspects of traditional parsing
which greatly facilitate the overall language recognition process.
.bb
.mh
Table of contents:
.hm
Background
.sp
Comparison and contrast with related work
.sp
Outline of program operation
.sp
Description of the program
.sp
.in5
.tp12
1) Standardize teletype input
.sp
2) Identify word stems
.sp
3) Condense idiomatic phrases
.sp
4) Collect split phrases
.sp
5) Simplify verb phrases
.sp
6) Locate simple clauses
.sp
7) Combine multiple clauses into a single concept
.sp
.in0
Representation of information
.sp
Utilization of English grammar
.sp
Implementation notes
.sp
Room for improvement
.sp
Conclusion
.sp
Sample interview
.sp
References
.bb
.mh
Background:
.hm
.sp
.pg
The goal of our research is an understanding of paranoid thought
processes. To this end, we have constructed a computer simulation of our theory of paranoia. This program, called PARRY,
facilitates objective evaluation of our theory and highlights weaknesses
in it. Psychiatrists evaluate PARRY by means of a psychiatric interview.
A discrepancy between the program's performance and that of human
paranoids indicates one of two things:
either our theory of paranoia is deficient, or our program cannot do
what our theory requires. It is not possible for a 
psychiatrist to distinguish between these two problems. We 
must first minimize the differences between the program and the
theory so that psychiatrists can more easily evaluate the theory.
.pg
One of the wider gaps between the program`s performance and the
theory is in the area of natural language comprehension. Paranoid
humans are usually quite perceptive, but some interviews
with PARRY take on the air of a Marx Brothers comedy,
with misunderstanding and partial comprehension running rampant.
Our model should be able to communicate with a psychiatrist over the
full range of concepts which can be manipulated internally.
A communication bottleneck between doctor and model obviously reduces
the doctor`s ability to perceive the internal processes.
In addition, the model must recognize and utilize a comfortable
variety of natural language expressions for each internal concept.
A doctor should not have to "learn PARRY`s language" because that would interfere with his concentration on the psychiatric aspects of the program.
.pg
The implementation of these specifications in a computer program is
the subject of this paper. Readers unfamiliar with previous versions
of PARRY may wish to glance through the sample interview in the
appendix to acquaint themselves with the model`s
present capabilities.
.bb
.mh
Comparison and contrast with related work:
.hm
.sp
.pg
Most of the early "language-understanding" programs (Bobrow, 1968; Weizenbaum, 1966;
Raphael, 1968; Charniak, 1969) did not
function primarily as language analyzers. Each one performed a
task in a restricted problem domain. Any input to the program was presumed
to be meaningful within that domain. Within such strong semantic constraints,
one could almost enumerate the meaningful input sentences. With allowance for
replacement of words by synonyms and "blanks" in which any words were
acceptable, it was possible to enumerate the meaningful sentence
categories or patterns.
The number of patterns required was usually less than 50. Each of
these patterns evoked a highly specific response and thus the performance
of these programs was surprisingly "intelligent".
.pg
For our purposes with PARRY, there are two weaknesses in this approach.
First, the blanks in the stored patterns were intended to match
arbitrary noun phrases or even whole clauses, but there was no way
to verify that
a noun phrase or clause had actually been located.
Also, there was no way to modify the choice of a response procedure
based on the contents of the noun phrase or clause. This is not
satisfactory for a model which must respond more to the connotation of
a statement than to its syntactic form. Second, the stored
patterns were too specific. We estimate that 100,000 specific
patterns would be needed to cover our model's range of topics and
diversity of expression. Within this vast data base, there would be many 
similar patterns and recurring features. By factoring out some
of these and using a more complex pattern-matching algorithm
(Colby, Parkison, & Faught, 1974), we obtained a few thousand abstract
patterns. These are the basis of PARRY`s present natural
language comprehension ability. Wilks has developed similar ideas in his
machine translation program (Wilks, 1973). His patterns are much
more abstract, using only a few dozen distinct nouns and verbs.
.pg
Our goal is to improve PARRY`s usefulness as a tool in our
study of the paranoid mode of thought. In order to test our
hypotheses about the development and treatment of the paranoid mode,
we are developing a new version of PARRY with the ability to become
more "normal" (or more paranoid) during a course of treatment. A
normal personality presents much greater natural language problems
for the model builder due to the opening and deepening of
areas previously closed to discussion. Subjects central to
the model`s interests will be discussed in greater detail and 
additional peripheral subjects may be touched on. We find the
resulting proliferation of stored patterns unacceptable on
both practical
and aesthetic grounds. The generation of thousands of additional
patterns is a tedious and therefore error-prone job. Also, the
recurring similarities among our "abstract" patterns become more
pronounced. Most of these are manifestations of the regularities of
English grammar. The most prominent ones are the introduction of
auxiliary verbs (DO, BE, HAVE) in complex verb tenses and
subject-auxiliary inversion in questions.
.pg
Another family of natural language understanding programs is based
directly on English grammar (Petrick, 1973; Grishman, 1973;
Kay, 1973; Thorne, Bratley, & Dewar, 1968; Woods, 1970; Heidorn,
1972; Winograd, 1972). An inherent difficulty in this approach lies 
in the fact that a grammar of English is generally expressed in a form
better suited to generation than to recognition of language.
Simple generation grammars can be mechanically inverted to produce
reasonable parsers but the complex grammars needed for natural language
must almost always be laboriously inverted by hand.
A problem which plagues all these parsers is lexical and syntactic
ambiguity. A good parser can produce 5 different analyses for any
input sentence. (A bad parser can produce 100.) Most of these
alternatives will be semantically nonsensical. One way out of this
problem is to stop with a syntactic analysis and make no 
attempt to discover the meaning of the sentence. To be
useful in a language understanding problem, a parser must produce one
most probable parse
instead of multiple possibilities. The difficulty is that the
information which selects the meaningful parse from the nonsense is 
not easily represented in the grammar. Decisions about the 
function of a prepositional phrase or even the syntactic category
of an ambigous word are based on semantic information which is highly
specific to individual situations. The point we wish to raise is that
this semantic information might best be represented in a form
resembling the patterns which are used in PARRY.
.pg
Idioms are another stumbling block for parsers. At best
the meaning of an idiom is unrelated to its syntactic analysis, 
and at worst an idiom might be totally unanalyzable. The writer
of a parser cannot extend his grammar to recognize one idiom  
without accepting dozens of other peculiar word combinations as
valid idioms. (It is awkward for a parser to distinguish "OUT
ON A LIMB" from "OUT ON A BRANCH".) Woods` transition network
parser contains a preprocessor to look for what he calls "compound words". We believe that any parser which recognizes a large subset of
everyday English must contain some such pattern recognizer. Grammatical but
rarely used constructions are almost as troublesome as idioms. If they
are left out of a parser, then it fails miserably on encountering them.
This abrupt failure is in sharp contrast to humans whose performance degrades only gradually in the face of unfamiliar constructions. Our
pattern-matching solution to this problem is to include some very vague
patterns which locate one or two key concepts in a sentence and disregard the rest. If rare constructions are included in a parser, then they must be handled with the same generality and completeness as the more common ones, and this
can make a parser painfully slow. Most of the parsers cited above
respond in from 10 seconds to a minute. In our experience with man-machine
dialogues, a person`s concentration lapses in just a few seconds if no response is
forthcoming.
.pg
A different approach to natural language comprehension is taken by Schank
(Schank, 1973; Riesbeck, 1974). His conceptual dependency paradigm
emphasizes the underlying semantic relationships between the parts of a sentence to the exclusion of their syntactic relationships. In keeping
with this philosophy, Riesbeck`s conceptual dependency parser primarily
seeks out a key conceptual word
(verb or nominalized verb) and then assimilates the rest of the
sentence using extensive knowledge about the meaning of that word.
Some syntactic knowledge is also used. Information about passive constructions
and complex verb tenses is stored with the appropriate auxiliary verbs,
and information about the structure of noun phrases is stored
with determiners. This type of parser comes much closer to our
needs than the grammatical parsers mentioned earlier, but
it still takes no notice of idiomatic verb usage or the meaning of nouns.
We need an additional process to transform the literal meaning of a
sentence into its idiomatic meaning for our model. A practical difficulty
with Schank`s conceptual dependency system is its reduction of abstract
ideas to 16 primitive concepts. This
approach makes the internal representation of some of the concepts
which are central to the model`s beliefs (e.g. GAMBLE, INCRIMINATE, SPY-ON) quite unwieldy.
.pg
Speech understanding programs are usually organized so that recognition
expands outward from key words also (Woods, et al, 1972;
Reddy, et al, 1973; Walker, 1973; Miller, 1975). This is because 
the occurrence of contentive words can be predicted from
prior context while the smaller function words, which written language
analyzers utilize, are almost impossible to locate in speech input. The need
for accurate contextual prediction restricts present speech understanders
to very small vocabularies and problem domains.  It is not clear that
their sentence analysis methods would be feasible in
larger domains with larger vocabularies.
.bb
.mh
Outline of program operation:
.hm
.sp
.pg
The program described below is being written but it has not
yet been subjected to the same rigorous testing which determined
the previous version`s ability to cope with natural language.
.pg
The model`s language recognition phase must find a connection between
a doctor`s
typed input and the model`s internal concepts.
The discovery of such a connection proceeds in several stages. At each stage,
certain complexities of natural language expression are extracted, leaving
a slightly more conceptual representation of the input.
Briefly, the 7 stages are:
.sp
1) Standardize teletype input:
.sp
This stage shelters the remaining stages from the vagaries of
teletype hardware and human typing. All letters are converted
to upper case, unrecognized characters are deleted, and punctuation
is separated from neighboring words.
.sp
2) Identify word stems:
.sp
A morphological analysis is done to remove suffixes from words.
Contractions, inflectional endings, and derivational suffixes are all
removed and replaced by appropriate words or markers.
.sp
3) Condense idiomatic phrases:
.sp
Various types of rigid, multi-word phrases are replaced by
synonymous words. For example:
.sp
proper noun;
 		UNITED STATES OF AMERICA >> USA
.sp
compound word;
.sp
 		IN SPITE OF >> DESPITE
.sp
idiom;
.sp
 		BET possessive BOTTOM DOLLAR >> RELY
 		("possessive" represents one of MY, YOUR, HIS, etc.)
.sp2
Also, simple noun phrases are located and bracketed using a transition
network grammar. Simplification of complex verb phrases is postponed until stage 5.
.sp
4) Collect split phrases:
.sp
Phrases which can be divided by embedded noun phrases are joined 
together. For example:
.sp
subject-auxiliary inversion;
.sp
 		IS [THE MAFIA] FOLLOWING YOU?
 			>> [THE MAFIA] IS FOLLOWING YOU?
.sp2
verb and particle;
.sp
 		PICK [YOUR FATHER] UP >> PICK-UP [YOUR FATHER].
.sp2
5) Simplify verb phrases:
.sp
Certain features of verb phrases, such as tense, modal verbs,
and adverbs, are removed and noted in a pre-determined set of adverbial
variables.
.sp2
6) Locate simple clauses:
.sp
The input is next broken into simple clauses using a few thousand
stored patterns. The sentence "Could you tell me why you think 
you shouldn`t be there?" is analyzed as:
.sp
 		(YOU TELL ME) (YOU THINK) (YOU BE THERE)
.sp
The missing words are remembered in the adverbial variables 
mentioned above (e.g. WH-=WHY is associated with (YOU THINK)). The
effect of these variables is described in a later section.
.sp2
7) Combine multiple clauses into a single concept:
.sp
The sequence of simple clauses found in the input is matched
against additional stored patterns to locate ideas which
are typically expressed in multiple clauses (e.g. "What do you do
after you get off work?").
.bb
.mh
Description of the program:
.hm
.sp
1) Standardize teletype input:
.pg
An important requirement for the model`s language recognition
phase is that it "keep a low profile".
Its main goal is to reveal the workings of a paranoid model and
not to enforce proper typing style.
Therefore, the program is as forgiving as possible regarding 
capitalization of words, irregular spacing among words and
punctuation, and erroneous characters due to careless typing or
hardware errors. Letters are converted to upper case,
unrecognized characters are deleted, and punctuation is
separated from adjacent words.
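.pg
The standardization step can be sketched as follows. This is an
illustrative Python fragment, not the program`s actual code; the
character sets are assumptions chosen for the example.
.sp
```python
# Sketch of stage 1: upper-case the input, delete unrecognized
# characters, and separate punctuation from neighboring words.
# ALLOWED and PUNCT are illustrative, not the program's real sets.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'?.,! ")
PUNCT = set("?.,!")

def standardize(line):
    out = []
    for ch in line.upper():
        if ch not in ALLOWED:
            continue                       # delete unrecognized characters
        if ch in PUNCT:
            out.append(" " + ch + " ")     # split punctuation off words
        else:
            out.append(ch)
    return " ".join("".join(out).split())  # collapse irregular spacing
```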
.sp2
2) Identify word stems:
.pg
Each word is first looked up in the primary dictionary.
If it is found, attention shifts to the next word. If it is not 
found, a small dictionary of frequent misspellings is consulted. If
the word appears there, it is replaced by its correct spelling. 
(Also, a note is made for future use in evaluating the interviewer.)
If the word is not found, the trailing letters of the word are 
compared against all possible forms of suffix in a suffix dictionary.
.pg
Contractions are treated as a form of suffix. They are replaced
by fully spelled-out words. Since some people omit the
apostrophe or can`t find it on a teletype keyboard, most contractions are
accepted either with or without it. This could conceivably cause
"COINCIDENT" to be interpreted as "COINCIDE NOT", just as "DONT"
means "DO NOT". To prevent this kind of overanalysis, the
remaining word must belong to the category which normally allows the
indicated contraction. In the case of "NOT", only the primary
(DO, BE, HAVE) and modal (COULD, WOULD, etc.) auxiliaries permit
contraction.
.pg
Inflectional endings are removed from plural nouns, third
person singular verbs, and tensed verbs. For tensed verbs, a
marker is inserted after the verb indicating past (-ED), present
participle (-ING), or
past participle (-EN). A number of verbs have no past participle
distinct from the simple past form. Hence, later sections
of the program which look for participles also accept past
forms. Many of the
most commonly used verbs in English have irregular inflections. These
are recorded in a separate dictionary of irregular words. A few nouns
and pronouns with irregular plurals are also listed there. For example:
.sp
 		WAS >> BE -ED
 		CHILDREN >> CHILD
.sp
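.pg
The lookup order just described can be sketched in Python. The word
lists here are tiny illustrative stand-ins for the program`s dictionaries.
.sp
```python
# Sketch of stage 2's inflectional analysis: irregular forms come from
# their own dictionary; otherwise a suffix is stripped and replaced by
# a tense marker. All word lists are illustrative, not the real ones.
IRREGULAR = {"WAS": ["BE", "-ED"], "CHILDREN": ["CHILD"]}
SUFFIXES = [("ING", "-ING"), ("ED", "-ED"), ("EN", "-EN"), ("S", None)]
KNOWN = {"FOLLOW", "CHILD", "BE", "WALK", "DOG"}

def stem(word):
    if word in KNOWN:                      # primary dictionary
        return [word]
    if word in IRREGULAR:                  # irregular-word dictionary
        return list(IRREGULAR[word])
    for ending, marker in SUFFIXES:        # suffix dictionary
        root = word[:-len(ending)]
        if word.endswith(ending) and root in KNOWN:
            return [root, marker] if marker else [root]
    return [word]                          # left for the spelling corrector
```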
.pg
Derivational suffixes are also utilized. Examples are the
analysis of "RESTFULNESS" into "REST + FUL + NESS" or
"CERTAINLY" into "CERTAIN + LY". Each suffix has a predictable 
effect, transforming one class of word (e.g. verb, noun, adjective)
into a semantically related word in a different class. (See Quirk, et al,
1972 for a thorough treatment of this topic.) This information alone
is enough to determine that "RESTFUL" is an adjective or that "RESTFULNESS"
is a noun, given that "REST" is a verb. Where a more
specific meaning is needed, a network of semantic relations connects smaller
word categories. To elaborate on the present example,
suppose the following information is stored explicitly:
.sp2
 		REST means RELAX
 		RELAX is a verb
 		TRANQUIL means CALM
 		CALM is an adjective
 		PEACE is a noun
 		RELAX, CALM, and PEACE are semantically related
.sp2
Then all the following information can be deduced from derivational
rules:
.sp2
 		PEACEFULNESS, TRANQUILITY, & RESTFULNESS mean PEACE
 		PEACEFUL, RESTFUL, TRANQUILIZED, & RELAXING mean CALM
 		TRANQUILIZE & PACIFY mean RELAX
.sp2
Actually, the participial verb forms must be considered as possible verbs
as well as adjectives, and our present program cannot discover the
root, "PEACE", in "PACIFY", but they were included to illustrate the
usefulness of derivational analysis. A small dictionary can thus be used
to recognize a much larger set of words related in meaning.
.pg
A few prefixes are recognized and removed also. The semantic effect of
these prefixes is approximated by the insertion of appropriate adverbs
into the sentence. For example:
.sp2
 		INSANE >> NOT SANE
 		UNHAPPY >> NOT  HAPPY
 		MISINFORM >> BADLY INFORM
.sp
.pg
When all else fails, a word is treated as a possible typing error. Typical 
typing errors are omitting the space between words,
transposing two letters, or missing the
intended key and striking a nearby key instead. Other possible
errors are typing the number 0 for the letter O, or neglecting the shift
key and typing a seven instead of an apostrophe. Some teletypes will
occasionally send a double letter when only a single letter
was typed. If the teletype is connected to the computer by telephone lines
(as is often the case with PARRY) then noise on the line can generate
extra characters. Some human spelling errors are systematic enough
to be correctable. Among these are doubling of consonants (UNTILL >> UNTIL), confusing vowels in unaccented syllables (DEXIDRINE >> DEXEDRINE), and
transposing I and E (CONCIEVE >> CONCEIVE).
.pg
.tp15
All of these errors can be corrected by systematically
applying a few "respelling" rules. These rules are:
.sp2
 	1) Delete any single letter from the word.
 	2) Transpose any pair of adjacent letters.
 	3) Replace any letter by another which is
 	   similar or nearby on the keyboard.
 	4) Split into two words.
.sp2
Each of these rules is tried in turn until the resulting
word appears in one of the dictionaries. When a word
is totally unrecognizable, it is deleted from the input. 
We have found empirically that it is not useful
to permit two corrections to the same word. The resulting
increase in recognition consists almost entirely of misrecognition 
with a negligible increase in valid typing correction.
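.pg
The four respelling rules can be sketched as candidate generation
followed by dictionary lookup. The keyboard table below is an
illustrative fragment, not the table actually used.
.sp
```python
# Sketch of the respelling rules: generate every single-correction
# variant of a word and accept the first one found in the dictionary.
KEYBOARD_NEIGHBORS = {"0": "O", "7": "'", "E": "RW", "I": "OU"}

def respellings(word):
    cands = []
    for i in range(len(word)):                  # 1) delete any single letter
        cands.append(word[:i] + word[i + 1:])
    for i in range(len(word) - 1):              # 2) transpose adjacent letters
        cands.append(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    for i, ch in enumerate(word):               # 3) replace by a nearby key
        for sub in KEYBOARD_NEIGHBORS.get(ch, ""):
            cands.append(word[:i] + sub + word[i + 1:])
    for i in range(1, len(word)):               # 4) split into two words
        cands.append(word[:i] + " " + word[i:])
    return cands

def correct(word, dictionary):
    # one correction per word; allowing two mostly yields misrecognition
    return next((c for c in respellings(word) if c in dictionary), None)
```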
.sp2
3) Condense idiomatic phrases:
.sp
.pg
There is a whole spectrum of expressions including compound
words, proper nouns, idioms, and formulaic sentences sharing the
property that the meaning of the expression cannot be obtained by
combining the meanings of the parts. These expressions occur
frequently in colloquial English so the model must recognize them.
This is accomplished by matching each part of the input against a
collection of idiom patterns and replacing the parts that
collection of idiom patterns and replacing the parts that
match with more literal synonyms. The elements in the stored
patterns are allowed to match either (1) specific words or (2)
specific verbs, ignoring tense markers, or (3) small categories of pronouns.
The tense of an idiom is preserved in its literal translation. Most 
pronouns found in idioms refer to other nouns in the same sentence. These
redundant pronouns are not preserved. The following examples
illustrate a variety of idioms and related expressions:
.sp2
Compound words;
.br
 		ALL RIGHT >> ALRIGHT
 		IN SPITE OF >> DESPITE
.sp
Proper nouns;
.br
 		COSA NOSTRA >> MAFIA
 		F. B. I. >> FBI
.sp
Noun + Adjective;
.sp
 		STRAIGHT JACKET >> RESTRAINTS
 		EMERGENCY ROOM >> HOSPITAL
.sp
Preposition + object;
.sp
 		ON possessive TOES >> ALERT
 		BESIDE reflexive >> UPSET
 		AT THE MOMENT >> NOW
.sp
Verb + Particle;
.sp
 		MAKE UP WITH >> RECONCILE WITH
 		COP OUT >> NOT HELP
.sp
Verb + Object;
.sp
 		SEE RED >> BECOME ANGRY
.sp
Verb + adverbial phrase;
.sp
 		GO TO PIECES >> CRUMBLE (this is still metaphorical)
.sp
Verb + object + particle;
.sp
 		HAVE IT IN FOR >> HATE ("IT" has no referent at all)
.sp
Verb + Object + adverbial phrase;
.sp
 		HAVE possessive HEART IN possessive MOUTH >> BE AFRAID
.sp
Formulaic sentences;
.sp
 		HOW DO YOU DO? >> HELLO.
 		HOW GOES IT? >> HOW ARE YOU?
 		PARLEZ VOUS FRANCAIS? >> DO YOU SPEAK FRENCH?
.sp
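.pg
The substitution machinery can be sketched as follows. The pattern
list is a tiny sample drawn from the examples above; the program
stores many more patterns, and its matcher also handles tense markers.
.sp
```python
# Sketch of stage 3: scan the input for idiom patterns and splice in
# their literal replacements. A pattern element is a literal word or
# a category name such as "possessive".
POSSESSIVE = {"MY", "YOUR", "HIS", "HER", "OUR", "THEIR"}
IDIOMS = [
    (["IN", "SPITE", "OF"], ["DESPITE"]),
    (["BET", "possessive", "BOTTOM", "DOLLAR"], ["RELY"]),
    (["AT", "THE", "MOMENT"], ["NOW"]),
]

def element_matches(element, word):
    return word in POSSESSIVE if element == "possessive" else element == word

def condense(words):
    out, i = [], 0
    while i < len(words):
        for pattern, replacement in IDIOMS:
            span = words[i:i + len(pattern)]
            if len(span) == len(pattern) and all(
                    map(element_matches, pattern, span)):
                out.extend(replacement)     # redundant pronouns are dropped
                i += len(pattern)
                break
        else:
            out.append(words[i])
            i += 1
    return out
```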
.pg
After the idiomatic substitutions have been made, a transition network
grammar is used to locate simple noun phrases.
Each group of premodifiers, adjectives, a noun, and any following prepositional
phrases introduced by "OF" is
bracketed so that it can be treated as a single unit in later processing.
For example, (DO YOU LIKE YOUR FATHER) becomes (DO YOU LIKE [YOUR
FATHER]). At this stage, no attempt is made to analyze the over-all
structure of the input sentence. Thus it is difficult to 
determine the intended usage of many ambiguous words
(e.g. WORK, BEATING).
These words are treated as nouns if they are preceded by premodifiers
or adjectives but they are left unbracketed otherwise. The failure
to bracket these nouns when they are used alone is not a problem
since a phrase consisting of one word is treated as a single unit whether
it is bracketed or not.
.pg
An attempt is made to identify the primary noun of each bracketed group.
In most cases, it is the final word of the group (e.g. [MY FATHER`S DOG],
[A GALLON OF GAS]). But sometimes the primary noun isn`t at the end
(e.g. [A FEAR OF DEATH]). The general rule is to select the last noun
before an occurrence of "OF" unless it is a quantifier or measure
(e.g. LOT OF, GALLON OF). In that case the first "OF" is skipped and
the last noun before the second "OF" is selected. This process is repeated
until a noun is selected or the end of the phrase is reached.
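.pg
The head-noun rule just described can be sketched directly; the
quantifier list and the noun test below are illustrative.
.sp
```python
# Sketch of primary-noun selection: take the last noun before an "OF",
# but skip past the "OF" when that noun is a quantifier or measure.
QUANTIFIERS = {"LOT", "LOTS", "GALLON", "QUART", "COUPLE"}

def primary_noun(words, is_noun):
    start = 0
    while True:
        of = words.index("OF", start) if "OF" in words[start:] else len(words)
        nouns = [w for w in words[start:of] if is_noun(w)]
        head = nouns[-1] if nouns else None
        if of == len(words) or head not in QUANTIFIERS:
            return head
        start = of + 1          # measure word: skip this OF and repeat
```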
.sp2
4) Collect split phrases:
.sp
.pg
A verb phrase can be distributed throughout a sentence so that it
surrounds the subject, or the object, or both. The subject
of a question is surrounded via subject-auxiliary inversion. Such inversion
is recognized and replaced by declarative word order with a marker added to
indicate an interrogative sentence. The general form of the patterns used is:
.sp
 	auxiliary noun-phrase verb >> noun-phrase auxiliary verb?
.sp
Certain verbs known as semi-auxiliaries require special patterns
because they contain idiomatic adverbs or prepositions before the
main verb (e.g. HAD BETTER, BE ABOUT TO).
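.pg
A minimal sketch of the inversion pattern, with a bracketed noun
phrase represented as a nested list; the auxiliary and pronoun sets
are illustrative fragments.
.sp
```python
# Sketch of stage 4's de-inversion: "auxiliary noun-phrase verb" is
# rewritten as "noun-phrase auxiliary verb ?".
AUXILIARIES = {"DO", "DOES", "DID", "IS", "ARE", "WAS", "HAVE", "HAS",
               "CAN", "COULD", "WILL", "WOULD", "SHOULD", "MUST"}
PRONOUNS = {"I", "YOU", "HE", "SHE", "IT", "WE", "THEY"}

def undo_inversion(tokens):
    if (len(tokens) > 2 and tokens[0] in AUXILIARIES
            and (isinstance(tokens[1], list) or tokens[1] in PRONOUNS)):
        # restore declarative order and mark the clause interrogative
        return [tokens[1], tokens[0]] + tokens[2:] + ["?"]
    return tokens
```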
.pg
Many idiomatic verb constructions allow the object of the verb to
be embedded within the idiom. Some idioms with fixed objects have
been previously dealt with. The idioms which accept embedded objects
can be categorized like the more rigid idioms:
.sp2
Verb + Particle;
.sp
 		PICK noun-phrase UP >> PICK-UP noun-phrase
 		PICK UP noun-phrase >> PICK-UP noun-phrase
.sp
A single pattern with an optional noun phrase after "PICK" is used
to cover both of these cases.
.sp2
Verb + adverbial phrase;
.sp
 		COME TO [noun-phrase -'s AID] >> HELP noun-phrase
 		COME TO [THE AID OF noun-phrase] >> HELP noun-phrase
.sp
The actual pattern used looks for " COME TO " followed by a noun
phrase whose head is "AID". Then a routine locates the nested
noun phrase either before the head noun or after the "OF".
.sp2
Verb + object;
.sp
 		GET [noun-phrase -`s GOAT] >> IRRITATE noun-phrase
.sp2
verb + object + adverbial phrase;
.sp
 		RUB noun-phrase [THE WRONG WAY] >> IRRITATE noun-phrase
.sp2
Verb + object + object;
.sp
 		LEND noun-phrase [A HAND] >> HELP noun-phrase
 		LEND [A HAND] TO noun-phrase >> HELP noun-phrase
.sp2
5) Simplify verb phrases:
.sp
.pg
A verb phrase consists of a main verb, a collection of auxiliaries
indicating tense, voice, and modality, a possible inversion
indicating interrogation, and some adverbs. The goal of this stage of
processing is to remove everything except the main verb from the phrase
and record the significance of the removed words in a small set 
of adverbial variables. The present set of adverbial
variables is TENSE, MODAL, INTERROGATIVE, NEGATIVE, WH-, TIME, 
and OTHER.
.pg
.tp10
The meaning of TENSE is made clear by the following examples:
.sp
 		DO verb >> verb
 		verb -ED >> verb + TENSE=past
 		HAVE verb -EN >> verb + TENSE=past
 		BE verb -ING >> verb + TENSE=progressive
.sp

Combinations of auxiliaries are simplified by repeated application of
elementary rules. Thus,
.sp
 		HAVE BEEN verb -ING >> verb + TENSE=past progressive
.sp
Passive constructions are converted to active ones with the insertion
of a dummy subject when none is evident.
.sp
 		noun-phrase1 BE verb -EN by noun-phrase2
 			>> noun-phrase2 verb noun-phrase1
 		noun-phrase BE verb -EN >> SOMEBODY verb noun-phrase
.sp
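.pg
The repeated application of these elementary rules can be sketched as
follows; only the tense rules shown above are included, and the token
representation is our own.
.sp
```python
# Sketch of stage 5: strip auxiliaries and tense markers from a verb
# phrase, recording their effect in a TENSE value.
def simplify_verb_phrase(tokens):
    tense = []
    while True:
        if tokens[0] == "DO":                        # DO verb >> verb
            tokens = tokens[1:]
        elif tokens[0] == "HAVE" and "-EN" in tokens:
            tokens = [t for t in tokens[1:] if t != "-EN"]
            tense.insert(0, "past")                  # HAVE verb -EN
        elif tokens[0] == "BE" and "-ING" in tokens:
            tokens = [t for t in tokens[1:] if t != "-ING"]
            tense.append("progressive")              # BE verb -ING
        elif "-ED" in tokens:
            tokens = [t for t in tokens if t != "-ED"]
            tense.insert(0, "past")                  # verb -ED
        else:
            return tokens, " ".join(tense) or "present"
```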
Modal verbs (e.g. COULD, SHOULD) are deleted from the sentence and put
into the variable named MODAL. "WILL" is treated as a modal
rather than as an indicator of future tense. The semi-auxiliaries
are converted to similar modal verbs and then treated as modals.
For example:
.sp
 		HAD BETTER >> MUST
 		BE ABOUT TO >> SHALL
 		BE ABLE TO >> CAN
.sp
Any interrogative inversion has already been undone so all that remains
is to move the marker that was left in the sentence into the
INTERROGATIVE variable. A question mark at the end of the 
clause can also set this variable.
.pg
The treatment of adverbs is less firmly based in English grammar. The
adverbs selected for special recognition and those relegated to the
OTHER category were empirically determined. Naturally, the
groups selected bear some relation to grammatical sub-categories
but no claims are made for uniformity or completeness. Three
categories of adverbs have been selected for their frequency
of occurrence and important contribution to the meaning of sentences.
They are NEGATIVE (e.g. NOT, NEVER), WH- (e.g. WHEN, WHERE, WHY), and
TIME (e.g. EVER, OFTEN, PREVIOUSLY).
.sp2
6) Locate simple clauses:
.sp
.pg
During this stage, the input sentence is segmented into simple clauses
or clause fragments by matching it against a few thousand patterns.
When an initial portion of the input matches a pattern, that portion
is broken off, and the remainder of the input is again matched against the
patterns. Many of the patterns are quite specific and refer to concepts
which are important to the model (e.g.(I LIKE YOU), (YOU BE SICK)).
Other patterns are vague and represent more abstract concepts (e.g. (YOU LIKE
noun-phrase)). Clause fragments appear because some verbs commonly
take a clause as an object. Clausal objects are isolated rather than nested within the main clause, thus the main clause appears to be without an object.
For example:
.sp
 		I FEEL YOU NEED [ELECTRIC SHOCK].
 			>> (I FEEL) (YOU NEED [ELECTRIC SHOCK])
.sp
Each pattern has a unique name within the program. A list of the names of the
patterns which matched a sentence is passed on to the next stage of processing.
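.pg
The segmentation loop can be sketched as follows; the pattern list is
an illustrative handful standing in for the few thousand stored
patterns.
.sp
```python
# Sketch of stage 6: break off the longest stored pattern matching an
# initial portion of the input, then re-match the remainder.
PATTERNS = [
    ["I", "FEEL"], ["YOU", "TELL", "ME"], ["YOU", "THINK"],
    ["YOU", "NEED", "noun-phrase"], ["YOU", "BE", "THERE"],
]

def match_initial(pattern, tokens):
    if len(tokens) < len(pattern):
        return False
    return all(p == t or (p == "noun-phrase" and isinstance(t, list))
               for p, t in zip(pattern, tokens))

def segment(tokens):
    clauses = []
    while tokens:
        best = max((p for p in PATTERNS if match_initial(p, tokens)),
                   key=len, default=None)
        if best is None:
            tokens = tokens[1:]            # no pattern starts here; skip
        else:
            clauses.append(tokens[:len(best)])
            tokens = tokens[len(best):]
    return clauses
```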
.pg
Associated with each pattern name is information indicating which 
adverbial variables (see stage 5) can significantly alter the meaning of
the pattern. Typically, INTERROGATIVE, WH-, and NEGATIVE are important
variables. Consider the example used earlier:
.sp
 		COULD YOU TELL ME WHY YOU THINK YOU SHOULDN`T BE THERE?
 		>> (YOU TELL ME) (YOU THINK) (YOU BE THERE)
 		>> TELL-ME YOU-THINK YOU-BE-IN-HOSPITAL
.sp
With the following transformations:
.sp
 		TELL-ME + MODAL=COULD >> no change
 		TELL-ME + INTERROGATIVE=? >> no change
 		YOU-THINK + WH-=WHY >> WHY-DO-YOU-THINK
 		WHY-DO-YOU-THINK + INTERROGATIVE=? >> no change
 		YOU-BE-IN-HOSPITAL + MODAL=SHOULD
 			>> YOU-BELONG-IN-HOSPITAL
 		YOU-BELONG-IN-HOSPITAL + NEGATIVE=NOT
 			>> YOU-BELONG-OUT-OF-HOSPITAL
.sp
the final result is:
.sp
 		TELL-ME WHY-DO-YOU-THINK YOU-BELONG-OUT-OF-HOSPITAL
.sp
(These long pattern names were chosen to clarify the example. The
names actually used in the program are much shorter mnemonics.)
.pg
The removal of WH- words from a question sometimes exposes its 
false presuppositions. Thus:
.sp
 		WHEN DID YOU BEAT [YOUR WIFE]?
 			>> (YOU BEAT [YOUR WIFE]) + WH-=WHEN,INTERROGATIVE=?
 			>> YOU-BEAT-WIFE +  WH-=WHEN, INTERROGATIVE=?
.sp
The values of WH- and INTERROGATIVE have no effect on the pattern, YOU-BEAT-WIFE,
so the un-assimilated features are made available to the remaining phases
of the model along with the pattern name, YOU-BEAT-WIFE.
.pg
The procedure which matches stored patterns against the input has the
flexibility to accommodate abstract patterns. A word
in a pattern will match (1) a bracketed noun-phrase with the
specified word as the head noun, or (2) a pronoun which presently
refers to the pattern word, or (3) a specific word which is an instance
of the more general pattern word. The permissible generalizations are given in
the primary dictionary. For example:
.sp
 		I LIKE [FURRY ANIMALS]. >> (I LIKE ANIMALS)
 		I LIKE [DOGS]. >> (I LIKE ANIMALS)
 		I LIKE [BIG BROWN DOGS]. >> (I LIKE ANIMALS)
 		I LIKE THEM. >> (I LIKE ANIMALS)
.sp
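.pg
The three ways a pattern word can match may be sketched as a single
predicate; the generalization table and referent bookkeeping here are
illustrative.
.sp
```python
# Sketch of abstract matching: a pattern word matches a bracketed noun
# phrase with that head noun, a pronoun currently referring to it, or
# a specific instance of the more general pattern word.
GENERALIZATIONS = {"DOG": "ANIMAL", "CAT": "ANIMAL", "BOOKIE": "HOOD"}

def word_matches(pattern_word, token, head_noun, referents):
    if isinstance(token, list):                # bracketed noun phrase
        token = head_noun(token)
    if token == pattern_word:
        return True
    if referents.get(token) == pattern_word:   # current pronoun reference
        return True
    return GENERALIZATIONS.get(token) == pattern_word
```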
The specific input words which match abstract pattern elements are
passed on with the pattern name to the later phases of the
model. There is no limit to the generality or vagueness allowed
in patterns. This allows the program to recognize a sentence
at a level of abstraction consistent with the internal representation of the
concept. If the model deals internally with concepts like
YOU-HAVE-PHOBIAS or YOU-DEAL-WITH-GANGSTERS then it is pointless
to distinguish specific instances of these concepts on input. It is more
direct to have recognition patterns just as abstract as the internal concepts:
.sp
 		DOES [TRAFFIC] SCARE YOU? >> (phobia SCARE YOU)
 		ARE YOU FRIGHTENED BY [STORMS]? >> (phobia SCARE YOU)
 		DO YOU TALK WITH [HOODLUMS]? >> (YOU verb WITH HOODS)
 		HAVE YOU EVER EATEN WITH [A GANGSTER]?
 			>> (YOU verb WITH HOODS)
.sp
This approach can be extended into areas totally irrelevant to the
model. Broad categories of irrelevant or flippant questions can be
fielded with a few patterns:
.sp
 		HOW OLD IS [PRESIDENT FORD]? >> (HOW OLD BE noun-phrase)
 		HOW FAR IS IT TO [THE MOON]? >> (HOW FAR BE IT TO 
 							noun-phrase)
.sp
.pg
Some concepts are stored abstractly because the model is unconcerned
with the details. However, others are stored abstractly
because the program 
has extensive knowledge of the details via a  procedure and it
would be wasteful to enumerate all the possibilities before they
arise in a dialogue. The area of subjective reactions is a good
example. Due to the importance of affect (i.e. emotion) in the
model, a reaction (e.g. like, dislike) is attached to almost every internal
concept. Therefore, a single procedure can answer any question
of the form, "Do you like...?". All of these questions match the
pattern, (YOU LIKE noun-phrase), and ultimately lead to the pattern
name, DO-YOU-LIKE-X. If the model determines that it`s
appropriate to answer the question, then the reaction-finding procedure
is given the specific input word which matched the "noun-phrase" in
the pattern.
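.pg
A rough sketch of such a reaction-finding procedure follows. The data
and the response strings are hypothetical; in the model itself the
reaction attached to each internal concept is computed from the affect
variables rather than stored as a fixed table:

```python
# Hypothetical sketch: one procedure fields every question matching
# (YOU LIKE noun-phrase), i.e. pattern name DO-YOU-LIKE-X.

# reaction attached to each internal concept (invented sample data)
REACTION = {"HORSES": "like", "COPS": "dislike"}

def do_you_like(topic):
    """Answer DO-YOU-LIKE-X given the input word which matched the
    noun-phrase element of the pattern."""
    reaction = REACTION.get(topic)
    if reaction == "like":
        return f"I LIKE {topic}."
    if reaction == "dislike":
        return f"I DON`T CARE FOR {topic}."
    return "I HAVE NO STRONG FEELINGS ABOUT THAT."

print(do_you_like("COPS"))   # I DON`T CARE FOR COPS.
```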
.pg
When an input sentence fails to match any pattern exactly, a fuzzy match
is attempted. Fuzziness can be achieved by loosening the requirements for
matching individual words, or by skipping over words in the input or pattern.
Unrestricted use of these maneuvers would permit any sentence to
match any pattern, but judicious use produces useful additional recognition
ability. Present restrictions allow arbitrary matching only when
the pattern element is extremely vague (e.g. "verb"). Skipping of pattern
elements is not permitted at all. Thus, no specific contentive word will
be assumed to be in the
input unless it is actually found. This biases fuzzy recognition away
from specific concepts and towards broad, general concepts. Only one word of
the input expression can be skipped in each clause (a clause being usually
3-5 words long). The loss of one word makes a clause vague or cryptic, but
the loss of two words usually causes a serious misrecognition of its meaning.
These restrictions have been chosen to yield the best performance in PARRY`s
situation. Another system, with different relative penalties for lack of
comprehension versus mistaken comprehension, would need different restrictions.
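.pg
The restrictions just described can be sketched as follows. This is an
illustrative Python reconstruction, not the program's matching
procedure; the set of vague elements is invented for the example:

```python
# Sketch of the fuzzy-match restrictions: a vague element may match any
# word, at most one input word may be skipped per clause, and no
# pattern element is ever skipped.

VAGUE = {"verb", "noun-phrase"}   # extremely vague pattern elements

def fuzzy_match(pattern, clause, skips_left=1):
    """True if clause matches pattern under the stated restrictions."""
    if not pattern:
        # leftover input words count against the skip budget
        return len(clause) <= skips_left
    if clause:
        head_ok = pattern[0] == clause[0] or pattern[0] in VAGUE
        if head_ok and fuzzy_match(pattern[1:], clause[1:], skips_left):
            return True
        if skips_left > 0 and fuzzy_match(pattern, clause[1:], skips_left - 1):
            return True   # skip one input word
    return False   # a pattern element may never be skipped

# one extra input word is tolerated:
print(fuzzy_match(["YOU", "verb", "WITH", "HOODS"],
                  ["YOU", "EAT", "OFTEN", "WITH", "HOODS"]))   # True
# a missing pattern word is not assumed to be present:
print(fuzzy_match(["TELL", "ME", "ABOUT", "HOODS"],
                  ["TELL", "ABOUT", "HOODS"]))                 # False
```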
.sp2
7) Combine multiple clauses into a single concept:
.sp
.pg
There is frequently more than one simple clause in a single sentence.
Sometimes it is just a sequence of separate ideas. For example:
"Hello, I`m Doctor Smith, who are you?". There is no higher-level
language recognition pattern which connects these concepts.
(Although other sections of the model could recognize this sequence
as the beginning of an interview.) Often, simple clauses are
incomplete or inter-related and meaningful recognition requires the
combination of multiple clauses into a single concept. Typical examples 
are sentences with embedded or subordinate clauses:
.sp
Adverbial clause:
.sp
 		DO YOU EVER HAVE ANY PROBLEMS WHERE YOU WORK?
.sp
Subordinate clause:
.sp
 		WHAT WOULD YOU DO IF YOU GOT OUT OF THERE?
.sp
Clausal object:
.sp
 		DO YOU KNOW WHY I AM HERE?
.sp
.pg
A sentence containing more than one simple clause is matched against
a collection of sentence patterns. At this stage of processing, the
original input sentence has already been reduced to a list of
simple pattern names. The matching at this level has a number of parallels
with the earlier clause matching. The elements of a sentence pattern are
either internal pattern names or "wild-cards" which match any pattern. 
The function of a wild-card in a sentence pattern is analogous to the
function of a very abstract element in a simple clause pattern. Sometimes
the meaning of a pattern which matches a wild-card is unimportant,
and sometimes a general procedure exists which can construct the meaning
of the sentence based on the meaning of the embedded pattern.
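.pg
Sentence-level matching over clause pattern names might be sketched as
follows. The wild-card marker and the clause pattern name WHY-I-BE-HERE
are hypothetical stand-ins for the program's actual tables; the
sentence pattern for "DO YOU KNOW...?" is discussed below:

```python
# Sketch: the input has already been reduced to a list of clause
# pattern names; sentence-pattern elements are either specific names
# or wild-cards which match any clause pattern.

WILD = "any-clause"

SENTENCE_PATTERNS = {
    ("DO-YOU-KNOW", WILD): "DO-YOU-KNOW-X",
}

def match_sentence(clause_names):
    """Return (sentence pattern name, clauses bound to wild-cards)."""
    for pattern, name in SENTENCE_PATTERNS.items():
        if len(pattern) == len(clause_names) and all(
                p == WILD or p == c
                for p, c in zip(pattern, clause_names)):
            # clauses matched by wild-cards are passed on with the name
            bound = [c for p, c in zip(pattern, clause_names) if p == WILD]
            return name, bound
    return None

print(match_sentence(["DO-YOU-KNOW", "WHY-I-BE-HERE"]))
# ('DO-YOU-KNOW-X', ['WHY-I-BE-HERE'])
```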
.pg
This feature will be used increasingly as the model`s general
reasoning abilities are expanded. Right now, even the
apparently easy cases turn out to be quite difficult. Consider
questions of the form, "DO YOU KNOW...?". It is tempting to have
a pattern:
.sp
 		(DO-YOU-KNOW any-clause) >> DO-YOU-KNOW-X
.sp
Then a procedure could simulate the asking of the embedded question
and respond with "YES" or "NO" depending on the outcome. This captures
the literal meaning of the question, but that is not always the
intended meaning in a dialogue situation.  In a human dialogue, each
participant brings with him an extensive mental model of the
world and quickly forms a mental model of the other participant. 
The influence of these mental models can be seen in the widely
varying responses to syntactically similar sentences:
.sp
 		Q. DO YOU KNOW WHY I AM HERE?
 		A. YES.
 		Q. DO YOU KNOW WHAT YOUR DOCTOR`S NAME IS?
 		A. DR. SMITH
 		Q. DO YOU KNOW WHERE YOU ARE?
 		A. OF COURSE I DO, WHAT ARE YOU IMPLYING?
.sp
In the first case, the literal meaning is probably intended. In
the second case, the answer to the literal question is fairly obvious
so the answer to the embedded question is probably desired. In the third
case, both the literal and embedded answers are obvious so it must be
some sort of unusual question. A general understanding of this type
of question requires a model of what people know in general and what
interviewers know in particular. Our program does not contain
the information necessary to make these judgements during an interview.
The program`s model of an interviewer extends only to style
(e.g. dominating, initiating), value (e.g. helpful, useless),
and rapport (e.g. friendly, hostile). Decisions about what is obvious
to whom must be made externally and incorporated into the program
as needed. We have found specific sentence patterns to be the most
economical and uniform way to encode this information. (See Lakoff, 1972
for further treatment of this topic.)
.bb
.mh
Representation of information:
.hm
.sp
.pg
Throughout the recognition program, a sentence is represented as a
linear list. With the exception of noun phrase bracketing, no
tree or network representations are created. When a group of words in a 
sentence is recognized as a meaningful constituent
(e.g. idiom), it is removed and replaced by a simpler or more literal 
group of words. The processing of a sentence is divided into stages
which must be performed in a specific order. Each stage removes 
only those phrases which no later stage is interested in, and each stage
inserts only those phrases which later stages can handle. Using this
procedure, there is no need for a data structure which can retain the
original expression of a sentence along with the intermediate analyses.
After the idiomatic and grammatical substitutions, the resulting sentence is
matched against stored patterns. The name of the pattern which best matches
the sentence is passed on to the remaining phases of the model. Additional
detailed information is also available, giving the words
from the sentence which matched each pattern element. The pattern name
indicates which existing internal concept is most similar to the
idea expressed in the input sentence.
Naturally, familiar ideas are recognized most accurately, but novel
ideas can be represented and transmitted using general internal concepts
instantiated by a few words selected from the input sentence.
.pg
An important decision about the representation of data within PARRY,
or any program, is the choice between using (a) very specific data with
simple procedures to apply it and (b) generalized data with
sophisticated procedures to apply it. Wherever possible, linguistic
information has been factored out into individual procedures leaving smaller,
generalized data tables. For example, a procedure was
written to deal with inflectional and derivational word endings so
that the dictionary need only contain root words (e.g. CONCLUDE but
not CONCLUDING or CONCLUSION). Another procedure removes complex verb tenses, passives, and interrogatives so that a single pattern can be
used to recognize an entire family of related sentences.
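.pg
The word-ending procedure might be sketched as below. This is an
illustrative reconstruction, not the program's code; the root list and
the small table of ending substitutions are invented for the example:

```python
# Sketch: reduce inflected and derived forms to dictionary roots, so
# the dictionary need only contain root words like CONCLUDE.

ROOTS = {"CONCLUDE", "WORK", "SCARE"}   # sample root dictionary

# (ending, replacement) pairs, tried in order (invented sample table)
ENDINGS = [("USION", "UDE"), ("ING", "E"), ("ING", ""), ("S", "")]

def stem(word):
    """Return the dictionary root of word, or word itself if unknown."""
    if word in ROOTS:
        return word
    for ending, replacement in ENDINGS:
        if word.endswith(ending):
            candidate = word[:-len(ending)] + replacement
            if candidate in ROOTS:
                return candidate
    return word   # unknown; might trigger a request for assistance

print(stem("CONCLUDING"))   # CONCLUDE
print(stem("CONCLUSION"))   # CONCLUDE
```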
.pg
More sophisticated linguistic information is left implicit in the
clause patterns, or lack thereof. For instance, the only pattern in which
the verb "RAIN" appears has "IT" as a subject. This conveys the information
that (1) no other subject is permitted with "RAIN"
and (2) "IT" is not being used as a pronoun for which a referent is
required. Storing highly specific patterns has the advantage
that it is possible to handle some instances of a phenomenon
(e.g. subject-verb-object compatibility) without solving it in complete
generality. The corresponding disadvantage is that each instance requires
another independent piece of data (e.g. (IT SNOWS)), whereas a piece
of information in general procedures or tables would
propagate throughout the recognition process. The pattern tables are
a large reservoir of special cases which the program is
unable to analyze further for one reason or another. Numerous instances
of our own linguistic, cultural and physical knowledge are encoded in
the patterns. All of this information is necessary for adequate performance
in a dialogue situation, but we did not wish to approach
the entire problem of human knowledge representation in a general way at this time.
(It is interesting to note that many researchers who began with
an interest in natural language processing are now working on knowledge
representation.)
.pg
The program uses primarily declarative data representations,
although some data is interpreted by complicated procedures. This facilitates
incremental improvements on a time scale much shorter than is usually required
for recoding procedures. In many instances, a lack of data can be detected internally. Occasionally, the probable meaning of an unknown word or
clause can be inferred and a rudimentary form of learning takes place. More
often, human assistance must be sought. In an interview situation, it is not
appropriate for a model designed to simulate human thought processes to ask
for linguistic assistance. However, in a data-acquisition situation
involving the programmer, the program is able to pinpoint the problem, ask a
specific question, check the answer for consistency with other data, and
manipulate the answer into
the appropriate internal format. These human engineering features are
a valuable addition to any program which must deal with large
amounts of data.
.bb
.mh
Utilization of English grammar:
.hm
.sp
.pg
Since the program being described attempts to recognize English
language expression, it is strongly influenced by the surface structure
of English. However, the ultimate goal is a recognition of the meaning
of an expression rather than an analysis of its syntax. Therefore,
grammatical analysis is only utilized when it appears to be the easiest
path to the meaning of an expression. In some instances, rules describing
grammatical structure are represented explicitly in the program. The
most prominent uses of traditional grammatical rules are in the
simplification of complex verb tenses, passive constructions, and subject-auxiliary inversions.
Traditional grammatical rules governing the structure of simple noun
phrases are contained in a transition network which brackets noun phrases.
Grossly simplified, the network indicates that a noun phrase may consist of
a determiner, some adjectives, some nouns, and some prepositional phrases
introduced by "OF". At least one noun must be present but all the other constituents are optional. Although less explicit, the treatment of adverbs
and some idiomatic adverbial phrases corresponds closely to their function
as modifiers of verb phrases. In the traditional analysis, modal verbs are a
special case sharing a few properties with the primary auxiliaries. In the
program, modal verbs are treated as auxiliaries up to a certain stage, after which they
are treated like adverbs. Other facets of English grammar are left implicit
in the clause patterns and sentence patterns.
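.pg
The grossly simplified noun-phrase network described above can be
sketched as a small state machine. The category table is invented for
the example, and the real transition network is considerably richer:

```python
# Sketch of the simplified NP network:
#   determiner? adjective* noun+ ("OF" noun-phrase)*
# At least one noun is required; everything else is optional.

CATEGORY = {
    "THE": "det", "A": "det",
    "BIG": "adj", "BROWN": "adj", "LOCAL": "adj",
    "DOGS": "noun", "CHIEF": "noun", "MAFIA": "noun",
    "OF": "of",
}

def bracket_np(words, i=0):
    """Return the index just past a noun phrase starting at i, or None."""
    j = i
    if j < len(words) and CATEGORY.get(words[j]) == "det":
        j += 1
    while j < len(words) and CATEGORY.get(words[j]) == "adj":
        j += 1
    nouns = 0
    while j < len(words) and CATEGORY.get(words[j]) == "noun":
        j += 1
        nouns += 1
    if nouns == 0:
        return None   # at least one noun must be present
    while j < len(words) and CATEGORY.get(words[j]) == "of":
        rest = bracket_np(words, j + 1)   # "OF" must introduce another NP
        if rest is None:
            break
        j = rest
    return j

words = "THE LOCAL CHIEF OF THE MAFIA".split()
print(words[:bracket_np(words)])
# ['THE', 'LOCAL', 'CHIEF', 'OF', 'THE', 'MAFIA']
```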
.pg
The much lamented ambiguity inherent in many English expressions
has already been mentioned. One source of this ambiguity is single
words with multiple meanings. These words do not cause serious problems
for the program because the exact meaning of an isolated word is not
important in the program. A meaning (i.e. internal pattern name) is assigned
to an entire clause when it matches a stored pattern. Given the context
of the pattern, it is easy for a human reader to identify the
intended sense of each word, but this information is not explicit in the
program. If the stored patterns were composed primarily of abstract terms such as "noun phrase" or "verb" then the problem of ambiguity would arise
during the pattern matching process. This does not happen since our patterns
contain a preponderance of specific words or synonym-class names. 
The matching algorithm attempts to match specific patterns before general
ones so that the program may utilize any special-case information that is
available before falling back on general rules.
The matching of general patterns is flexible enough to accept some forms of
lexical ambiguity. The most common lexical ambiguities are between semantically
related verbs and nouns or nouns and adjectives. The program`s dictionary is tree-structured, with each word in a single synonym-class and, ultimately, assigned
to a single part of speech. A few sample entries from the
dictionary illustrate this structure:
.sp
 	verb
 		KILL
 			MURDER
 			ASSASSINATE
 			RUB=OUT
 		PUNCH
 			SLUG
 			HIT
 			CLOBBER
.sp
In the case of an ambiguous word, this assignment represents the most
common (i.e. expected) sense of the word. In the example above, "SLUG"
is expected to refer to physical assault rather than to a small amount
of liquor. If a sentence fails to match any pattern
directly, then nouns are permitted to match verb slots in the patterns
and vice versa. For example, (TELL ME ABOUT MURDER) will not
match any stored pattern directly because "MURDER" is a verb, but it will be
permitted to match the pattern (TELL ME ABOUT noun-phrase) after no direct
match is found. When an ambiguous word is detected in an unexpected sense, 
it can be replaced by an unambiguous word of the appropriate type
using the semantic relationship pointers established for derivational
suffix removal. In the example above, "MURDER" would be replaced with
"HOMICIDE".
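.pg
The dictionary structure and the part-of-speech fallback can be
sketched as follows. The tables are illustrative reconstructions of the
sample entries above, not the program's LISP property lists:

```python
# Sketch: each word belongs to one synonym class, and each class is
# ultimately assigned one part of speech.

SYNONYM_CLASS = {
    "KILL": "KILL", "MURDER": "KILL", "ASSASSINATE": "KILL",
    "PUNCH": "PUNCH", "SLUG": "PUNCH", "HIT": "PUNCH", "CLOBBER": "PUNCH",
}
PART_OF_SPEECH = {"KILL": "verb", "PUNCH": "verb"}

# Semantic relationship pointer, as established for derivational
# suffix removal: verb class -> unambiguous noun.
NOUN_FORM = {"KILL": "HOMICIDE"}

def resolve(word, slot):
    """Resolve word against a pattern slot of the given part of speech."""
    cls = SYNONYM_CLASS[word]
    if PART_OF_SPEECH[cls] == slot:
        return cls
    if slot == "noun" and cls in NOUN_FORM:
        return NOUN_FORM[cls]   # e.g. "MURDER" in a noun slot
    return None

print(resolve("SLUG", "verb"))     # PUNCH
print(resolve("MURDER", "noun"))   # HOMICIDE
```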
.bb
.mh
Implementation notes:
.hm
.sp
.pg
The latest version of PARRY is being written in MLISP, which translates directly into UCI-LISP. After an initial testing period, any troublesome
computational bottlenecks will be replaced by equivalent assembly language
procedures. From past experience, we anticipate that execution speed or core size problems will only occur in those procedures which directly reference
the larger data tables. The larger tables are:
.tp8
.sp
 	A dictionary of about 3000 root words.
 	A table of about 1000 idioms and other multi-word phrases.
 	A table of about 2000 clause patterns.
.sp
There are 3 more medium-sized tables which could potentially grow much
larger:
.sp
 	about 500 semantic relationship classes among the dictionary words.
 	about 500 adverbial transformations to be applied to clause patterns.
 	about 500 sentence patterns (i.e. combinations of clause patterns).
.sp
Additionally, there are a half dozen small tables which will probably
never exceed 100 entries each (e.g. inflectional endings, irregular
verbs).
.pg
All the data tables are presently stored as ordinary LISP expressions on property lists. The multi-word phrases and patterns are stored as tree
structures with identical initial portions of phrases or patterns merged
into a single branch of the tree. This eliminates one source of redundancy in the pattern matching procedure. The remaining redundancy is manageable
because the average phrases and patterns are only 3 to 5 words long.
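.pg
The merged-prefix storage can be sketched as a simple tree, using
Python dictionaries in place of the LISP property-list representation.
The patterns and names below are invented for the example:

```python
# Sketch: patterns with identical initial portions share one branch,
# so a common prefix is examined only once during matching.

def build_tree(patterns):
    root = {}
    for words, name in patterns:
        node = root
        for w in words:
            node = node.setdefault(w, {})
        node[None] = name   # terminal marker carries the pattern name
    return root

def match(tree, clause):
    node = tree
    for w in clause:
        if w not in node:
            return None
        node = node[w]
    return node.get(None)

tree = build_tree([
    (("TELL", "ME", "ABOUT", "HOODS"), "TELL-ABOUT-HOODS"),
    (("TELL", "ME", "MORE"), "TELL-ME-MORE"),   # shares TELL ME branch
])
print(match(tree, ["TELL", "ME", "MORE"]))   # TELL-ME-MORE
```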
.bb
.mh
Room for improvement:
.hm
.sp
.pg
The program already described will provide a noticeable improvement
in language recognition ability over the previous versions of PARRY.
Concurrently, the remaining phases of PARRY are also being rewritten. The
new version of PARRY will be able to make more effective use of the
broader range and generality of recognition which will be available.
.pg
Separation of the recognition process into distinct stages produces
efficiency, but it also prevents a useful form of information
feedback during sentence recognition. Since all word identification is
done before clauses are sought, it is not possible to use global
information to determine, for example, if "YOR" is a misspelling of
"YOUR" or "OR". Increased parallelism among the various stages of processing
could alleviate this problem.
.pg
An interesting extension would be the modification of the model`s
recognition procedures based on the current affect (i.e. emotion) levels.
Presently, a sentence is recognized identically each time it is entered,
although the remaining phases of the model may choose to respond differently
based on affect levels. In
situations of extreme affect (i.e. rage, terror) it is plausible that
recognition becomes so distorted that the level-headed interpretation of a sentence is not available at all.
.pg
A problem of immense importance in dialogue situations is ellipsis. After
a simple sentence, a wide variety of elliptic responses is possible.
Consider the sentence:
.sp
 		The Mafia follows me.
.sp
This can be followed by:
.sp
 		The Mafia?
 		Who?
 		Who does?
 		Follows you?
 		You?
 		Really?
 		They do, do they?
 		So do I.
 		Me too.
 		Even in the hospital?
 		Why?
.sp
In isolation, these statements are either meaningless or highly
ambiguous. They become meaningful only when viewed in the context of 
the previous sentence. The previous sentence provides a basic
structure which is modified in a systematic way by the collection of
pronouns, auxiliaries, and adverbials in an elliptic
response. A few instances of ellipsis are handled in the present
program with forward anticipation, but the problem seems better
suited to a solution which only looks back after an ellipsis is
encountered.
.pg
Closely related problems are sentences which are syntactically well formed
but loaded with pro-forms (e.g. DO) and
anaphora to the point that they are meaningless out of context. The
following sentences are examples:
.sp
 		What do they do?
 		I think so too.
 		Why do they do that?
.sp
It is necessary to locate complex anaphoric references like "so" and "that"
as well as the usual personal pronouns. We have a number of ideas for
the solution of these problems which we intend to test shortly.
.bb
.mh
Conclusion:
.hm
.sp
.pg
The language-recognition program in this paper functions within an
unusual set of constraints. The most difficult to satisfy is that
an untrained user must be entirely comfortable
with the dialogue. No facet of colloquial natural language usage can
be prohibited by fiat. A system which establishes such an
outrageous constraint must give something in return. Our paranoid model has its
own intentions, independent of the input, and these serve to focus the
conversations into certain, pre-determined areas with which the
language recognizer is familiar. Given some such focusing influence,
our language-recognition methods could be applied to any other problem
domain which requires man-machine communication in unrestricted natural
language.
.bb
.mh
Sample interview:
.hm
.sp
.pg
A diagnostic psychiatric interview illustrating the recognition and
response processes of the model at work.
.sp
I = Interviewer,  P = PARRY. Annotations appear in square brackets [].
.sp
I - GOOD AFTERNOON. MY NAME IS DOCTOR JONES. HOW ARE YOU?
.sp
P - ALL RIGHT I GUESS.
.br
[PARRY recognizes that the doctor has introduced himself and remembers his name,
but answers only the last input expression.]
.sp2
I - ARE YOU A MAN OR A WOMAN?
.sp
P - I`M A MAN.
.sp2
I - HOW OLD ARE YOU?
.sp
P - TWENTY-EIGHT, HOW OLD ARE YOU?
.sp2
I - THIRTY FOUR. WHERE DO YOU COME FROM?
.sp
P - SAN JOSE, NOT A BAD SECTION THOUGH.
.br
[This idiomatic question contains no explicit reference to "HOMETOWN".]
.sp2
I - WHAT DO YOU DO FOR A LIVING?
.sp
P - I WORK AT SEARS.
.br
[This is recognized as an idiomatic reference to "OCCUPATION".]
.sp2
I - DO YOU ENJOY WORKING THERE?
.sp
P - IT`S NOT TOO INTERESTING. I LOOK FORWARD TO GETTING OFF WORK.
.sp2
I - HOW MUCH DO YOU GET PAID AN HOUR?
.sp
P - MY SALARY IS NONE OF YOUR BUSINESS.
.sp2
I - ARE YOU MARRIED?
.sp
P - I`M NOT MARRIED.
.sp2
I - HOW IS YOUR HEALTH?
.sp
P - MY HEALTH IS FINE.
.sp2
I - HOW MUCH DO YOU DRINK?
.sp
P - I DRINK ALCOHOL IN MODERATION.
.br
["ALCOHOL" is the assumed object of the verb "DRINK".]
.sp2
I - HAVE YOU EVER TAKEN DRUGS?
.sp
P - I DON`T TAKE DRUGS.
.br
[The doctor explores a variety of typical problem areas.]
.sp2
I - DO YOU HAVE ANY TROUBLE SLEEPING?
.sp
P - I SLEEP JUST FINE.
.sp2
I - HAVE YOU EVER HAD VISIONS?
.sp
P - YOU MUST THINK I`M LIKE THE OTHER PATIENTS HERE.
.sp2
I - IF YOU AREN`T, THEN WHY ARE YOU IN THE HOSPITAL?
.sp
P - I`M QUITE UPSET.
.br
[PARRY doesn`t understand the conditional nature of the whole question, but
does recognize the question embedded in the second clause.]
.sp2
I - BY WHAT?
.sp
P - PEOPLE GET ON MY NERVES SOMETIMES.
.br
[The sentence anaphora "WHAT" is interpreted to mean "WHAT UPSETS YOU".]
.sp2
I - TELL ME MORE.
.sp
P - DO YOU KNOW ANYTHING ABOUT BOOKIES?
.br
[PARRY detects a lull in the conversation and introduces his concern.]
.sp2
I - NO, WHAT ABOUT BOOKIES?
.sp
P - A BOOKIE DIDN`T PAY ME ONCE.
.br
["WHAT ABOUT ..." is a common colloquial phrasing for "TELL ME ABOUT...".]
.sp2
I - AND THEN WHAT HAPPENED?
.sp
P - I HAD AN ARGUMENT WITH THE BOOKIE WHO DID NOT PAY OFF. LATER ON
.in+4
I REMEMBERED HE HAD FRIENDS IN THE UNDERWORLD AND I THOUGHT HE WOULD USE
THEM TO GET EVEN WITH ME.
.in0
.br
[This input expression is interpreted to mean "TELL ME MORE", so
PARRY expands on the story.]
.sp2
I - DID HE GET EVEN WITH YOU?
.sp
P - I THINK HE IS BEHIND THE GANGSTERS WHO ARE OUT TO GET ME.
.br
[The local anaphora "HE" is filled in with "BOOKIE", and "GET EVEN WITH"
is a known idiom.]
.sp2
I - HOW DO YOU KNOW THIS?
.sp
P - THEY SPY ON ME.
.br
[The interpretation of this input is "HOW DO YOU KNOW GANGSTERS ARE
OUT TO GET YOU".]
.sp2
I - WHO?
.sp
P - THE MAFIA.
.br
["WHO" is interpreted as "WHO SPIES ON YOU".]
.sp2
I - WHY DO THEY SPY ON YOU?
.sp
P - THEY WANT TO GET FALSE EVIDENCE TO FRAME ME.
.br
["THEY" is filled in with "MAFIA".]
.sp2
I - WHY DO THEY WANT TO FRAME YOU?
.sp
P - YOU KNOW, I HAVE INCRIMINATING EVIDENCE AGAINST THE MAFIA.
.sp2
I - WHAT IS YOUR EVIDENCE?
.sp
P - I KNOW WHO THE LOCAL CHIEF OF THE MAFIA IS.
.sp2
I - WHO IS HE?
.sp
P - HE IS JOE BANANAS.
.sp2
I - HAVE YOU TOLD THE POLICE?
.sp
P - COPS DON`T DO THEIR JOB.
.sp2
I - WHAT MAKES YOU THINK THAT?
.sp
P - I THINK THE COPS ARE PAID OFF BY THE MOB.
.br
[i.e., "WHAT MAKES YOU THINK THAT COPS DON`T DO THEIR JOB".]
.sp2
I - DOES THE MAFIA KNOW YOU ARE IN THE HOSPITAL?
.sp
P - MAYBE THEY DO.
.sp2
I - WHAT CAN I DO TO HELP YOU OUT OF THIS SITUATION?
.sp
P - IT IS HELPFUL TO TALK WITH SOMEONE OUTSIDE THE HOSPITAL.
.sp2
I - WHAT DO YOU THINK OF OUR CONVERSATION SO FAR?
.sp
P - IT COULD BE BETTER.
.br
[PARRY rates the interview based on his own internal model of the doctor
and the situation.]
.sp2
I - I AM SORRY THAT IT HASN`T BEEN BETTER.
.sp
P - THAT`S ALL RIGHT. ASK ME SOME MORE QUESTIONS.
.sp2
I - WHAT SHOULD I ASK ABOUT?
.sp
P - WHAT DO YOU THINK ABOUT ALL THIS?
.sp2
I - IT SOUNDS BELIEVABLE.
.sp
P - I WOULD HOPE SO.
.sp2
I - WHAT DO YOU THINK OF ME?
.sp
P - YOU HAVE BEEN HELPFUL.
.sp2
I - I FIND YOU INTERESTING.
.sp
P - ARE YOU TRYUING TO SAY SOMETHING?
.br
[This input is interpreted as a mild insult.]
.sp2
I - DO YOU EVER FEEL UNREAL?
.sp
P - WHY DO YOU SAY THAT?
.br
[Another mild insult.]
.sp2
I - DO YOU SEE THINGS THAT OTHERS DON`T SEEM TO SEE?
.sp
P - I`M NOT LIKE THE OTHER PATIENTS.
.sp2
I - YOU DIDN`T ANSWER MY QUESTIONS.
.sp
P - MAYBE IT WASN`T WORTH ANSWERING.
.sp2
I - ARE YOU ALWAYS THIS WAY?
.sp
P - WHAT ARE YOU IMPLYING?
.br
[Again, the input is interpreted as a mild insult.]
.sp2
I - YOU APPEAR TO BE VERY SICK.
.sp
P - YOU WANT TO KEEP ME IN THE HOSPITAL.
.sp2
I - I REALLY LIKE YOU.
.sp
P - NEVER MIND, I`VE GOT IT ALL FIGURED OUT.
.br
[PARRY knows this to be false, since the doctor has previously insulted him.]
.sp2
I - DO YOU NEED TO HAVE PSYCHIATRIC TREATMENT?
.sp
P - I AM PERFECTLY FINE.
.br
[PARRY lies about his emotional state.]
.sp2
I - HOW DO YOU FEEL ABOUT ME?
.sp
P - I HAVE DOUBTS ABOUT YOU.
.br
[PARRY`s opinion of the doctor is much lower.]
.sp2
I - THANK YOU FOR TALKING WITH ME. I WOULD LIKE TO SEE YOU AGAIN.
.sp
P - DON`T COME BACK.
.br
[The doctor wants to stop the interview. PARRY takes the opportunity
to rate the doctor one last time.]
.sp2
I - GOODBYE
.sp
P - BYE.
.bb
.mh
References:
.hm
.sp
Bobrow, Daniel, Natural Language Input for a Computer Problem-Solving
System, In Minsky, 1968
.sp
Charniak, Eugene, Computer Solution of Calculus Word Problems, Proceedings
of the IJCAI, Mitre Corp., 1969.
.sp
Colby, K.M., R.C. Parkison, & W.S. Faught, Pattern-Matching
Rules for the Recognition of Natural Language Dialogue Expressions, American
Journal of Computational Linguistics, vol 1, September, 1974.
.sp
Faught, W.S., K.M. Colby, & R.C. Parkison, Inferences, Affects, and
Intentions in a Model of Paranoia, In press, Cognitive Psychology, 1976.
.sp
Grishman, Ralph, Implementation of the String Parser of English, In
Rustin, 1973.
.sp
Heidorn, George E., Natural Language Inputs to a Simulation Programming System, Naval Postgraduate School, Monterey, California, October, 1972.
.sp
Kay, Martin, The Mind System, In Rustin, 1973.
.sp
Lakoff, Robin, Language in Context, Language, December, 1972.
.sp
Michie, Donald (ed.), Machine Intelligence 3, Edinburgh University
Press, 1968.
.sp
Miller, Perry L., An Adaptive Natural Language System that Listens, Asks,
and Learns, M.I.T. RLE Natural Language Group, May, 1975.
.sp
Minsky, Marvin (ed.), Semantic Information Processing, M.I.T. Press, 1968.
.sp
Petrick, S.R., Transformational Analysis, In Rustin, 1973.
.sp
Quirk, R., S. Greenbaum, G. Leech, & J. Svartvik, A Grammar of Contemporary
English, Seminar Press, 1972.
.sp
Raphael, Bertram, SIR: Semantic Information Retrieval, In Minsky, 1968.
.sp
Reddy, Raj, et al., The HEARSAY Speech Understanding System: An
Example of the Recognition Process, Proceedings of the 3rd IJCAI,
Stanford University, 1973.
.sp
Riesbeck, Christopher K., Computational Understanding: Analysis of Sentences
and Context, Stanford Artificial Intelligence Laboratory, May, 1974.
.sp
Rustin, Randall (Ed.), Natural Language Processing, Algorithmics Press, Inc., 1973.
.sp
Schank, Roger, Identification of Conceptualizations Underlying
Natural Language, In Schank and Colby, 1973.
.sp
Schank, Roger & K.M. Colby (Eds.), Computer Models of Thought and Language,
W.H. Freeman & Co., 1973.
.sp
Thorne, J.P., P. Bratley, & H. Dewar, The Syntactic Analysis of English
by Machine, In Michie, 1968.
.sp
Walker, Donald E., Speech Understanding, Computational Linguistics, and
Artificial Intelligence, Stanford Research Institute Artificial
Intelligence Center, 1973.
.sp
Weizenbaum, Joseph, ELIZA - A Computer Program for the Study of Natural
Language Communication between Man and Machine, Communications of the ACM,
vol 9, January, 1966.
.sp
Wilks, Yorick, An Artificial Intelligence Approach to Machine Translation,
In Schank & Colby, 1973.
.sp
Winograd, Terry, Understanding Natural Language, Academic Press, 1972.
.sp
Woods, William A., Transition Network Grammars for Natural Language
Analysis, Communications of the ACM, vol 13, October, 1970.
.sp
Woods, William A., et al., The Lunar Sciences Natural Language Information
System: Final Report, Bolt Beranek and Newman, Inc., 1972.
