Research Report AI-1989-01
Artificial Intelligence Programs
The University of Georgia
Athens, Georgia 30602

Available by ftp from aisun1.ai.uga.edu (128.192.12.9)

Series editor: Michael Covington
mcovingt@aisun1.ai.uga.edu

GULP 2.0: An Extension of Prolog for Unification-Based Grammar

Michael A. Covington
Advanced Computational Methods Center
University of Georgia
Athens, Georgia 30602

January 1989

ABSTRACT: A simple extension to Prolog facilitates
implementation of unification-based grammars (UBGs) by
adding a new notational device, the feature structure,
whose behavior emulates graph unification. For
example, a:b..c:d denotes a feature structure in
which a has the value b, c has the value d, and the
values of all other features are unspecified. A
modified Prolog interpreter translates feature
structures into Prolog terms that unify in the desired
way. Thus, the extension is purely syntactic,
analogous to the automatic translation of "abc" to
[97,98,99] in Edinburgh Prolog.
The extended language is known as GULP (Graph
Unification Logic Programming); it is as powerful and
concise as PATR-II (Shieber 1986a,b) and other grammar
development tools, while retaining all the versatility
of Prolog. GULP can be used with grammar rule notation
(DCGs) or any other parser that the programmer cares
to implement.
Besides its uses in natural language processing,
GULP provides a way to supply keyword arguments to any
procedure.
1. Introduction
A number of software tools have been developed for
implementing unification-based grammars, among them PATR-II
(Shieber 1986a,b), D-PATR (Karttunen 1986a), PrAtt (Johnson and
Klein 1986), and AVAG (Sedogbo 1986). This paper describes a
simple extension to the syntax of Prolog that serves the same
purpose while making a much less radical change to the language.
Unlike PATR-II and similar systems, this system treats feature
structures as first-class objects that can appear in any context, not
just in equations.1 Further, feature structures can be used not
only in natural language processing, but also to pass keyword
arguments to any procedure.

1 The first version of GULP (Covington 1987) was
developed with support from National Science Foundation
Grant IST-85-02477. I want to thank Franz Guenthner,
Rainer Bäuerle, and the other researchers at the
Seminar für natürlich-sprachliche Systeme, University
of Tübingen, for their hospitality and for helpful
discussions. The opinions and conclusions expressed
here are solely those of the author.

The extension is known as GULP (Graph Unification Logic
Programming). It allows the programmer to write a:b..c:d to
stand for a feature structure in which feature a has the value b,
feature c has the value d, and all other features are
uninstantiated.2 The interpreter translates feature structures
written in this notation into ordinary Prolog terms that unify in
the desired way. Thus, this extension is similar in spirit to
syntactic devices already in the language, such as writing "abc"
for [97,98,99] or writing [a,b,c] for .(a,.(b,.(c,[]))).

2 This use of the colon makes the Quintus Prolog
and Arity Prolog module systems unavailable; so far,
this has not caused problems.

GULP can be used with grammar rule notation (definite clause
grammars, DCGs) or with any parser that the programmer cares to
implement in Prolog.
2. What is unification-based grammar?
2.1. Unification-based theories
Unification-based grammar (UBG) comprises all theories of
grammar in which unification (merging) of feature structures plays
a prominent role. As such, UBG is not a theory of grammar but
rather a formalism in which theories of grammar can be expressed.
Such theories include functional unification grammar, lexical-
functional grammar (Kaplan and Bresnan 1982), generalized phrase
structure grammar (Gazdar et al. 1986), head-driven phrase
structure grammar (Pollard and Sag 1987), and others.
UBGs use context-free grammar rules in which the nonterminal
symbols are accompanied by sets of features. The addition of
features increases the power of the grammar so that it is no
longer context-free; indeed, in the worst case, parsing with such
a grammar can be NP-complete (Barton, Berwick, and Ristad 1987:93-
96).
However, in practice, these intractable cases are rare.
Theorists restrain their use of features so that the grammars, if
not actually context-free, are close to it, and context-free
parsing techniques are successful and efficient. Joshi (1986) has
described this class of grammars as "mildly context-sensitive."
2.2. Grammatical features
Grammarians have observed since ancient times that each word
in a sentence has a set of attributes, or features, that determine
its function and restrict its usage. Thus:
The                   dog                barks.
category:determiner   category:noun      category:verb
                      number:singular    number:singular
                                         person:3rd
                                         tense:present
The earliest generative grammars of Chomsky (1957) and others
ignored all of these features except category, generating
sentences with context-free phrase-structure rules such as
sentence --> noun phrase + verb phrase
noun phrase --> determiner + noun
plus transformational rules that rearranged syntactic structure.
Syntactic structure was described by tree diagrams (Figure 1).3
Number and tense markers were treated as separate elements of the
string (e.g., boys = boy + s). "Subcategorization" distinctions,
such as the fact that some verbs take objects and other verbs do
not, were handled by splitting a single category, such as verb,
into two (verb_transitive and verb_intransitive).
But complex, cross-cutting combinations of features cannot
be handled in this way, and Chomsky (1965) eventually attached
feature bundles to all the nodes in the tree (Figure 2). His
contemporaries accounted for grammatical agreement (e.g., the
agreement of the number features of subject and verb) by means of
transformations that copied features from one node to another.
This remained the standard account of grammatical agreement for
many years.
But feature copying is unnecessarily procedural. It
presumes, unjustifiably, that whenever two nodes agree, one of
them is the source and the other is the destination of a copied
feature. In practice, the source and destination are hard to
distinguish. Do singular subjects require singular verbs, or do
singular verbs require singular subjects? This is an empirically
meaningless question. Moreover, when agreement processes interact
to combine features from a number of nodes, the need to
distinguish source from destination introduces unnecessary
clumsiness.

3 Figures are printed at the end of this document.

2.3. Unification-based grammar
Unification-based grammar attacks the same problem non-
procedurally, by stating constraints on feature values. For
example, the rule
[2.3a]   PP --> P  NP
                   [case:acc]
says that in a prepositional phrase, the NP must be in the
accusative case.
More precisely, rule [2.3a] says the feature structure
[case:acc]
must be unified (merged) with whatever features the NP already
has. If the NP already has case:acc, all is well. If the NP has
no value for case, it acquires case:acc. But if the NP has
case with some value other than acc, the unification fails and
the rule cannot apply.
Agreement is handled with variables, as in the rule
[2.3b]   S  -->  NP               VP
                 [ person:X ]     [ person:X ]
                 [ number:Y ]     [ number:Y ]
which requires the NP and VP to agree in person and number. Here
X and Y are variables; person:X merges with the person
feature of both the NP and the VP, thereby ensuring that the same
value is present in both places. The same thing happens with
number.
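The effect of rule [2.3b] can be approximated in ordinary Prolog, without GULP, by giving each nonterminal one argument per feature. The following is a minimal sketch under that assumption; the four-word lexicon and the argument order (person, then number) are invented for illustration and are not part of the grammar discussed here.

```prolog
% Agreement by shared variables in a plain DCG (no GULP needed).
% The same Person and Number variables appear on NP and VP, so the
% two constituents must carry unifiable values.
s --> np(Person, Number), vp(Person, Number).

% Hypothetical toy lexicon: np(Person, Number), vp(Person, Number).
np(third, singular) --> [the, dog].
np(third, plural)   --> [the, dogs].

vp(third, singular) --> [barks].
vp(third, plural)   --> [bark].

% ?- phrase(s, [the, dog, barks]).     succeeds
% ?- phrase(s, [the, dogs, barks]).    fails: plural does not unify
%                                      with singular
```

Neither constituent is the "source" of the number value: the rule only states that the two values are the same.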
Strictly speaking, the category label (S, NP, VP, etc.) is
part of the feature structure. Thus,
NP
[case:acc]
is short for:
[ category:NP ]
[ case:acc    ]
In practice, however, the category label usually plays a primary
role in parsing, and it is convenient to give it a special status.
Grammar rules can alternatively be written in terms of
equations that the feature values must satisfy. In equational
notation, rules [2.3a] and [2.3b] become:
[2.3c]  PP --> P NP     NP case = acc
[2.3d]  S --> NP VP     NP person = VP person
                        NP number = VP number
or even, if the category label is to be treated as a feature,
[2.3e]  X --> Y Z       X category = PP
                        Y category = P
                        Z category = NP
                        Z case = acc
[2.3f]  X --> Y Z       X category = S
                        Y category = NP
                        Z category = VP
                        Y person = Z person
                        Y number = Z number
where X, Y, and Z are variables. Equations are used in PATR-II,
PrAtt, and other implementation tools, but not in the system
described here.
The value of a feature can itself be a feature structure.
This makes it possible to group features together to express
generalizations. For instance, one can group syntactic and
semantic features together, creating structures such as:
[ syn: [ case:acc      ] ]
[      [ gender:masc   ] ]
[                        ]
[ sem: [ pred:MAN      ] ]
[      [ countable:yes ] ]
[      [ animate:yes   ] ]
Then a rule can copy the syntactic or semantic features en masse
to another node, without enumerating them.
2.4. A sample grammar
Features provide a powerful way to pass information from one
place to another in a grammatical description. The grammar in
Figure 3 is an example. It uses features not only to ensure the
grammaticality of the sentences generated, but also to build a
representation of the meaning of the sentence. Every constituent
has a sem feature representing its meaning. The rules combine the
meanings of the individual words into predicate-argument
structures representing the meanings of all of the constituents.
The meaning of the sentence is represented by the sem feature of
the topmost S node.
Like all the examples given here, this grammar is intended
only as a demonstration of the power of unification-based grammar,
not as a viable linguistic analysis. Thus, for simplicity, the
internal structure of the NP is ignored and the proposal to group
syntactic features together is abandoned.
To see how the grammar works, consider how the sentence Max
sees Bill would be parsed bottom-up. The process is shown in
Figure 4. First rules [c], [d], and [e] supply the features of the
individual words (Figure 4a). Next the bottom-up parser attempts
to build constituents.
By rule [b], sees and Bill constitute a VP (Figure 4b). At
this step, construction of a semantic representation begins. The
sem feature of the VP has as its value another feature structure
which contains two features: pred, the semantics of the verb, and
arg2, the semantics of the direct object.
Rule [b] also assigns the feature case:acc to Bill; this
has no effect on the form of the noun but would be important if a
pronoun had been used instead.
Finally, rule [a] allows the resulting NP and VP to be
grouped together into an S (Figure 4c). This rule assigns
nominative case to Max and combines the semantics of the NP and VP
to construct the sem feature of the S node, thereby accounting
for the meaning of the complete sentence.
It would be equally feasible to parse top-down. Parsing
would then begin with an S node, expanded to NP and VP by rule
[a]. The NP would then expand to Max using rule [d], thereby
supplying a value for sem of NP, and hence also for sem:arg1 of
S. Similarly, expansion of the VP would supply values for the
remaining features of S.
Crucially, it is possible (and necessary) to match variables
with each other before giving them values. In a top-down parse, we
know that sem:arg2 of S will have the same value as sem:arg2
of VP long before we know what this value is to be.
2.5. Functions, paths, re-entrancy, and graphs
A feature can be viewed as a partial function which, given a
feature structure, may or may not yield a value. For instance,
given the structure
[ syn: [ case:acc    ] ]
[      [ gender:masc ] ]
[ sem: MAN             ]
the feature sem yields the value MAN, the feature syn yields
another feature structure, and the feature tense yields no value
(it is a case in which the partial function is undefined).
A path is a series of features that pick out an element of a
nested feature structure. Formally, a path is the composition of
the functions just mentioned. For example, the path syn:case is
what you get by applying the function case to the value of the
function syn; applied to the structure above, syn:case yields
the value acc. Path notation provides a way to refer to a single
feature deep in a nested structure without writing nested
brackets. Thus one can write rules such as
PP --> P  NP
          [syn:case:acc]
or, in equational form,
PP --> P NP       NP syn case = acc
Feature structures are re-entrant. This means that features
are like pointers; if two of them have the same value, then they
point to the same object, not to two similar objects. If this
object is subsequently modified (e.g., by giving values to
variables), the change will show up in both places. Thus the
structure
[ a:b ]
[ c:b ]
[ e:d ]
is more accurately represented as something like:
[ a ---+--> b ]
[ c ---+      ]
[ e ------> d ]
There is only one b, and a and c both point to it.
Re-entrant feature structures can be formalized as directed
acyclic graphs (DAGs) as shown in Figure 5. Features are arcs and
feature values are the vertices or subgraphs found at the ends of
the arcs. A path is a series of arcs chained together.
Re-entrancy follows from the way variables behave in a
grammar. All occurrences of the same variable take on the same
value at the same time. (As in Prolog, like-named variables in
separate rules are not considered to be the same variable.) The
value may itself contain a variable that will later get a value
from somewhere else. This is why bottom-up and top-down parsing
work equally well.
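Prolog variables already behave in this re-entrant way, which is a small illustration of why Prolog terms are a natural host for feature structures. The sketch below assumes a made-up three-position term fs/3 whose positions stand for the features a, c, and e of the structure shown earlier; it is not GULP's representation.

```prolog
% Features a and c share one value cell V; feature e has its own value.
% Argument positions of fs/3 (assumed for this sketch): a, c, e.
reentrant(fs(V, V, d)).

% ?- reentrant(fs(A, C, E)), A = b.
%    Binding A also binds C, because A and C are the same variable.
```

Giving a value through one feature makes it visible through the other, regardless of which one is instantiated first.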
2.6. Unification
Our sample grammar relies on the merging of partially
specified feature structures. Thus, the subject of the sentence
gets case from rule [a] and semantics from rule [d] or [e]. This
merging can be formalized as unification. The unifier of two
feature structures A and B is the smallest feature structure C
that contains all the information in both A and B.
Feature structure unification is equivalent to graph
unification, i.e., merging of directed acyclic graphs, as defined
in graph theory. The unifier of two graphs is the smallest graph
that contains all the nodes and arcs in the graphs being unified.
This is similar but not identical to Prolog term unification;
crucially, elements of the structure are identified only by name,
not (as in Prolog) by position.
Formally, the unification of feature structures A and B
(giving C) is defined as follows:
(1) Any feature that occurs in A but not B, or in B but not
A, also occurs in C with the same value.
(2) Any feature that occurs in both A and B also occurs in
C, and its value in C is the unifier of its values in A and
B.
Feature values, in turn, are unified as follows:
(a) If both values are atomic symbols, they must be the same
atomic symbol, or else the unification fails (the unifier
does not exist).
(b) A variable unifies with any object by becoming that
object. All occurrences of that variable henceforth
represent the object with which the variable has unified.
Two variables can unify with each other, in which case they
become the same variable.
(c) If both values are feature structures, they unify by
applying this process recursively.
Thus
[ a:b ]   and   [ c:d ]
[ c:d ]         [ e:f ]
unify giving:
[ a:b ]
[ c:d ]
[ e:f ]
Likewise, [a:X] and [a:b] unify, instantiating X to the value
b; and
[ a:X ]   and   [ a:c ]
[ b:c ]         [ b:Y ]
unify by instantiating both X and Y to c.
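The definition above can be turned into a short Prolog program that identifies features by name, not by position. This is only a sketch under stated assumptions, and it is not how GULP works internally: feature structures are represented here as open-ended lists of Feature-Value pairs such as [a-b, c-d|_], and the predicate names fs_unify/2, fs_merge/2, fs_put/3, and fs_get/3 are invented for the example.

```prolog
% A feature structure is an open-ended list of Feature-Value pairs,
% e.g. [a-b, c-d | _].  Values are atomic, unbound, or nested
% feature structures.  (Representation assumed for this sketch.)

% fs_unify(?X, ?Y): unify two values or feature structures in place.
fs_unify(X, Y) :- var(X), !, X = Y.      % rule (b): a variable
fs_unify(X, Y) :- var(Y), !, Y = X.      %   becomes the other object
fs_unify(X, Y) :- X == Y, !.             % rule (a): identical atoms
fs_unify(X, Y) :-                        % rule (c): two structures
    X = [_|_], Y = [_|_],
    fs_merge(X, Y),                      % everything in X goes into Y
    fs_merge(Y, X).                      % and everything in Y into X

% fs_merge(+From, ?Into): put each pair of From into Into.
fs_merge(From, _) :- var(From), !.       % reached the open tail
fs_merge([F-V|Rest], Into) :-
    fs_put(Into, F, V),
    fs_merge(Rest, Into).

% fs_put(?FS, +F, ?V): FS has feature F with a value unifying with V.
fs_put(FS, F, V) :- var(FS), !, FS = [F-V|_].   % absent: add it
fs_put([F0-V0|_], F, V) :- F0 == F, !, fs_unify(V0, V).
fs_put([_|Rest], F, V) :- fs_put(Rest, F, V).

% fs_get(+FS, +F, ?V): look up F without extending FS.
fs_get(FS, _, _) :- var(FS), !, fail.
fs_get([F0-V0|_], F, V) :- F0 == F, !, V = V0.
fs_get([_|Rest], F, V) :- fs_get(Rest, F, V).

% ?- A = [a-b, c-d|_], B = [c-d, e-f|_], fs_unify(A, B), fs_get(A, e, V).
%    V = f
% ?- fs_unify([a-b|_], [a-d|_]).
%    fails: b and d are different atoms
```

After a successful call, both arguments list the same features with shared values, though possibly in different orders. The need to search each list by name is the cost that a fixed positional encoding avoids.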
As in Prolog, unification is not always possible.
Specifically, if A and B have different (non-unifiable) values for
some feature, unification fails. A grammar rule requiring A to
unify with B cannot apply if A and B are not unifiable.
Unification-based grammars rely on failure of unification to
rule out ungrammatical sentences. Consider, for example, why our
sample grammar generates Max sees me but not Me sees Max. In Max
sees me, both rule [b] and rule [f] specify that me has the
feature case:acc, giving the structure shown in Figure 6.
However, in Me sees Max, the case of me raises a conflict.
Rule [a] specifies case:nom and rule [f] specifies case:acc.
These values are not unifiable; hence the specified merging of
feature structures cannot go through, and the sentence is not
generated by the grammar.
2.7. Declarativeness
Unification-based grammars are declarative, not procedural.
That is, they are statements of well-formedness conditions, not
procedures for generating or parsing sentences. That is why, for
example, sentences generated by our sample grammar can be parsed
either bottom-up or top-down.
This declarativeness comes from the fact that unification is
an order-independent operation. The unifier of A, B, and C is the
same regardless of the order in which the three structures are
combined. This is true of both graph unification and Prolog term
unification.
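For Prolog term unification, the order-independence is easy to observe directly. The sketch below uses an invented three-position term x/3 as a stand-in for a feature structure with three features.

```prolog
% Three partial descriptions of the same structure, combined in two
% different orders.  x/3 is a hypothetical stand-in for a feature
% structure with three features.
order1(T) :- T = x(a,_,_), T = x(_,b,_), T = x(_,_,c).
order2(T) :- T = x(_,_,c), T = x(a,_,_), T = x(_,b,_).

% ?- order1(T1), order2(T2), T1 == T2.
%    succeeds: both T1 and T2 are x(a,b,c)
```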
The declarative nature of UBGs is subject to two caveats.
First, although unification is order-independent, particular
parsing algorithms are not. Recall that grammar rules of the form
A --> A B
cannot be parsed top-down, because they lead to infinite loops
("To parse an A, parse an A and then..."). Now consider a rule of
the form
A      -->  A      B
[f:X]       [f:Y]
If X and Y have different values, then top-down parsing works
fine; if either X or Y does not have a value at the time the rule
is invoked, top-down parsing will lead to a loop. This shows that
one cannot simply give an arbitrary UBG to an arbitrary parser and
expect useful results; the order of instantiation must be kept in mind.
Second, many common Prolog operations are not order-
independent, and this must be recognized in any implementation
that allows Prolog goals to be inserted into grammar rules.
Obviously, the cut (!) interferes with order-independence by
blocking alternatives that would otherwise succeed. More
commonplace predicates such as write, is, and == lack order-
independence because they behave differently depending on whether
their arguments are instantiated at the time of execution.
Colmerauer's Prolog II (Giannesini et al. 1986) avoids some of
these difficulties by allowing the programmer to postpone tests
until a variable becomes instantiated, whenever that may be.
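A two-line example makes the order-dependence of == concrete. The predicate names test_first/1 and bind_first/1 and the atom sg are invented for this sketch; the two clauses contain the same goals in opposite orders.

```prolog
% ==/2 tests identity at the moment it is called, so goal order matters.
test_first(X) :- X == sg, X = sg.   % fails if X is unbound on entry
bind_first(X) :- X = sg, X == sg.   % succeeds if X is unbound on entry

% ?- test_first(_).    fails
% ?- bind_first(_).    succeeds
```

A grammar rule containing such a test therefore gives different results depending on whether the relevant feature has been instantiated when the rule is applied.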
2.8. Building structures and moving data
Declarative unification-based rules do more than just pass
information up and down the tree. They can build structure as they
go. For example, the rule
VP                     -->  V            NP
[ sem: [ pred:X ] ]         [ sem:X ]    [ sem:Y ]
[      [ arg:Y  ] ]
builds on the VP node a pred-arg structure that is absent on the V
and NP.
Unification can pass information around in directions other
than along the lines of the tree diagram. This is done by
splitting a feature into two sub-features, one for input and the
other for output. The inputs and outputs can then be strung
together in any manner.
Consider for example the rule:
S                      -->  NP                     VP
[ sem: [ in:X1  ] ]         [ sem: [ in:X1  ] ]    [ sem: [ in:X2  ] ]
[      [ out:X3 ] ]         [      [ out:X2 ] ]    [      [ out:X3 ] ]
This rule assumes that sem of the S has some initial value
(perhaps an empty list) which is passed into X1 from outside. X1
is then passed to the NP, which modifies it in some way, giving
X2, which is passed to the VP for further modification. The
output of the VP is X3, which becomes the output of the S.
Such a rule is still declarative and can work either forward
or backward; that is, parsing can still take place top-down or
bottom-up. Further, any node in the tree can communicate with any
other node via a string of input and output features, some of
which simply pass information along unchanged. The example in
section 4.2 below uses input and output features to undo unbounded
movements of words. Johnson and Klein (1985, 1986) use in and out
features to perform complex manipulations of semantic structure;
see section 4.3 (below) for a GULP reconstruction of part of one
of their programs.
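The same in/out threading can be written in a plain DCG by giving each nonterminal a pair of arguments. This is a sketch only: the two-word lexicon and the terms subj/1 and pred/1 are invented, and each word simply adds one item to the front of the list it receives.

```prolog
% Each nonterminal carries an In and an Out argument, playing the
% roles of sem:in and sem:out in the rule above.
s(In, Out) --> np(In, Mid), vp(Mid, Out).

% Hypothetical lexicon: each word pushes one item onto the list.
np(In, [subj(max)|In])    --> [max].
vp(In, [pred(sleeps)|In]) --> [sleeps].

% ?- phrase(s([], Out), [max, sleeps]).
%    Out = [pred(sleeps), subj(max)]
```

The initial value (here the empty list) is supplied from outside, passed through the NP, then through the VP, and returned as the output of the S.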
3. The GULP translator
3.1. Feature structures in GULP
The key idea of GULP is that feature structures can be
included in Prolog programs as ordinary data items. For instance,
the feature structure
[ a:b ]
[ c:d ]
is written:
a:b..c:d
and GULP translates a:b..c:d into an internal representation
(called a value list) in which the a position is occupied by b,
the c position is occupied by d, and all other positions, if any,
are uninstantiated.
This is analogous to the way ordinary Prolog translates
strings such as "abc" into lists of ASCII codes. The GULP
programmer always uses feature structure notation and never deals
directly with value lists. Feature structures are order-
independent; the translations of a:b..c:d and of c:d..a:b are
the same.
Nesting and paths are permitted. Thus, the structure
[ a:b          ]
[ c: [ d:e ]   ]
[    [ f:g ]   ]
is written a:b..c:(d:e..f:g).4 The same structure can be
written as
[ a:b   ]
[ c:d:e ]
[ c:f:g ]
which GULP renders as a:b..c:d:e..c:f:g.

4 Arity Prolog 4.0 requires a space before the '('.

GULP feature structures are data items -- complex terms --
not statements or operations. They are most commonly used as
arguments. Thus, the rule
S               -->  NP              VP
[ person:X ]         [ person:X ]    [ person:X ]
[ number:Y ]         [ number:Y ]    [ number:Y ]
can be written in DCG notation, using GULP, as:
s(person:X..number:Y) -->
np(person:X..number:Y),
vp(person:X..number:Y).
They can also be processed by ordinary Prolog predicates. For
example, the predicate
nonplural(number:X) :- nonvar(X), X \= plural.
succeeds if and only if its argument is a feature structure whose
number feature is instantiated to some value other than plural.
Any feature structure unifies with any other feature
structure unless prevented by conflicting values. Thus, the
internal representations of a:b..c:d and c:d..e:f unify,
giving a:b..c:d..e:f. But a:b does not unify with a:d because
b and d do not unify with each other.
3.2. GULP syntax
Formally, GULP adds to Prolog the operators `:' and `..'
and a wide range of built-in predicates. The operator `:' joins a
feature to its value, which itself can be another feature
structure. Thus in c:d:e, the value of c is d:e.
A feature-value pair is the simplest kind of feature
structure. The operator `..' combines feature-value pairs to
build more complex feature structures.5 This is done by simply
unifying them. For example, the internal representation of
a:b..c:d is built by unifying the internal representations of
a:b and c:d.
This fact can be exploited to write "improperly nested"
feature structures. For example,
a:b..c:X..c:d:Y..Z
denotes a feature structure in which:
  the value of a is b,
  the value of c unifies with X,
  the value of c also unifies with d:Y, and
  the whole structure unifies with Z.

5 For compatibility with earlier versions, `..' can also be written `::'.

Both operators, `:' and `..', are right-associative; that
is, a:b:c = a:(b:c) and A..B..C = A..(B..C). Arity
Prolog 4.0 requires an intervening space when `:' or `..' occurs
adjacent to a left parenthesis; other Prologs lack this
restriction.
3.3. Built-in predicates
GULP 2.0 is an ordinary Prolog environment with some built-
in predicates added. The most important of these is load, which
loads clauses into memory through the GULP translator. (A
consult or reconsult would not translate feature structures
into their internal representations.) Thus,
?- load myprog.
loads clauses from the file MYPROG.GLP.
Like reconsult, load clears away any pre-existing clauses
for a predicate when new clauses for that predicate (with the same
arity) are first encountered in a file. However, load does not
require the clauses for a predicate to be contiguous, so long as
they all occur in the same file. A program can consist of several
files that are loaded into memory together.
Another predicate, ed, calls a full-screen editor and then
loads the file. Without an argument, ed or load uses the same
file name as on the most recent invocation of either ed or load.
Other special predicates are used within the program. GULP
1.1 required a declaration such as
g_features([gender,number,case,person,tense]).
declaring all feature names before any were used. This declaration
is optional in GULP 2.0. If present, it establishes the order in
which features will appear whenever a feature structure is output,
and it can be used to optimize the program by putting frequently
used features at the beginning. Further, whether or not the
programmer includes a g_features declaration, GULP 2.0 maintains
in memory an up-to-date g_features clause with a list of all the
features actually used, in the order in which they were
encountered.
The predicate g_translate/2 interconverts feature
structures and their internal representations. This makes it
possible to process, at runtime, feature structures in GULP
notation rather than translated form. For instance, if X is a
feature structure, then g_translate(Y,X), write(Y) will
display it in GULP notation.
The predicate display_feature_structure outputs a
feature structure, not in GULP notation, but in a convenient
tabular format, thus:
syn:  case: acc
      gender: masc
sem:  pred: MAN
      countable: yes
      animate: yes
This is similar to traditional feature structure notation, but
without brackets.
3.4. Internal representation
The nature of value lists, which represent feature
structures internally, is best approached by a series of
approximations. The nearest Prolog equivalent to a feature
structure is a complex term with one position reserved for the
value of every feature. Thus
[ number:plural ]
[ person:third  ]
[ gender:fem    ]
could be represented as x(plural,third,fem) or
[plural,third,fem] or the like. It is necessary to decide in
advance which argument position corresponds to each feature.
A feature structure that does not use all of the available
features is equivalent to a term with anonymous variables; thus
[ person:third ]
would be represented as x(_,third,_) or [_,third,_].
Structures of this type simulate graph unification in the
desired way. They can be recursively embedded. Further,
structures built by instantiating Prolog variables are inherently
re-entrant, since an instantiated Prolog variable is actually a
pointer to the memory representation of its value.
All the feature structures in a program must be unifiable
unless they contain conflicting values. Accordingly, if fifteen
features are used in the program, every value list must reserve
positions for all fifteen. One option would be to represent value
lists as 15-argument structures:
tense:present => x(_,_,_,_,present,_,_,_,_,_,_,_,_,_,_)
This obviously wastes memory. A better solution would be to
use lists; a list with an uninstantiated tail unifies with any
longer list. The improved representation is:
tense:present => [_,_,_,_,present|_]
By putting frequently used features near the beginning, this
representation can save a considerable amount of memory as well as
reducing the time needed to do unifications. Further, lists with
uninstantiated tails gain length automatically as further elements
are filled in; unifying [a,b,c|_] with [_,_,_,_,e|_] gives
[a,b,c,_,e|_].
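Both approximations can be tried in any Prolog system, since they rely only on ordinary term unification. In this sketch the predicate names term_demo/1 and list_demo/1 are invented, and the feature order number, person, gender is the one assumed in the example above.

```prolog
% Fixed-arity term: one argument position per feature
% (assumed order: number, person, gender).
term_demo(T) :- T = x(_, third, _), T = x(plural, _, fem).

% Open-tailed list: a short list grows when it is unified with a
% list that fills in a later position.
list_demo(L) :- L = [a,b,c|_], L = [_,_,_,_,e|_].

% ?- term_demo(T).    T = x(plural, third, fem)
% ?- list_demo(L).    L = [a, b, c, _, e|_]
```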
If most of the lists in the program have uninstantiated
tails, the program can be simplified by requiring all lists to
have uninstantiated tails. Any process that searches through a
list will then need to check for only one terminating condition
(remainder of list uninstantiated) rather than two (remainder of
list uninstantiated or empty).
But the GULP internal value list structure is not an
ordinary list. If it were, translated feature structures would be
confused with ordinary Prolog lists, and programmers would fall
victim to unforeseen unifications. It would also be impossible to
test whether a term is a value list.
Recall that Prolog lists are held together by the functor
`.'. That is,
[a,b,c|X] = .(a,.(b,.(c,X)))
To get a distinct type of list, all we need to do is substitute
another functor for the dot. GULP uses g_/2. (In fact, all
functors beginning with g_ are reserved by GULP for internal
use.) So if tense is the fifth feature in the canonical order,
then
tense:present => g_(_,g_(_,g_(_,g_(_,g_(present,_)))))
It doesn't matter that this looks ugly; the GULP programmer never
sees it.
One more refinement (absent before GULP version 2.0) is
needed. We want to be able to translate value lists back into
feature structure notation. For this purpose we must distinguish
features that are unmentioned from features that are merely
uninstantiated. That is, we do not want tense:X to turn into an
empty feature structure just because X is uninstantiated. It may
be useful to know, during program testing, that X has unified
with some other variable even if it has not acquired a value.
Thus, we want to record, somehow, that the variable X was
mentioned in the original feature structure whereas the values of
other features (person, number, etc.) were not.
Accordingly, g_/1 (distinct from g_/2) is used to mark all
features that were mentioned in the original structure. If
person is second in the canonical order, and tense is fifth in
the canonical order (as before), then
tense:present..person:X =>
g_(_,g_(g_(X),g_(_,g_(_,g_(g_(present),_)))))
And this is the representation actually used by GULP. Note that
the use of g_/1 does not interfere with unification, because
g_(present) will unify both with g_(Y) (an explicitly
mentioned variable) and with an empty position.
3.5. How translation is done
GULP loads a program by reading it, one term at a time, from
the input file, and translating all the feature structures in each
term into value lists. The term is then passed to the built-in
predicate expand_term, which translates grammar rule (DCG)
notation into plain Prolog. The result is then asserted into the
knowledge base. There are two exceptions: a term that begins with
`:-' is executed immediately, just as in ordinary Prolog, and a
g_features declaration is given special treatment to be
described below.
To make translation possible, GULP maintains a stored set of
forward translation schemas, plus one backward schema. For
example, a program that uses the features a, b, and c
(encountered in that order) will result in the creation of the
schemas:
g_forward_schema(a,X,g_(X,_)).
g_forward_schema(b,X,g_(_,g_(X,_))).
g_forward_schema(c,X,g_(_,g_(_,g_(X,_)))).
g_backward_schema(a:X..b:Y..c:Z,g_(X,g_(Y,g_(Z,_)))).
Each forward schema contains a feature name, a variable for the
feature value, and the minimal corresponding value list. To
translate the feature structure a:xx..b:yy..c:zz, GULP will
mark each of the feature values with g_(...), and then call, in
succession,
g_forward_schema(a,g_(xx), ... ),
g_forward_schema(b,g_(yy), ... ),
g_forward_schema(c,g_(zz), ... ) ...
and unify the resulting value lists. The result will be the same
regardless of the order in which the calls are made. To translate
a complex Prolog term, GULP first converts it into a list using
`=..', then recursively translates all the elements of the list
except the first, then converts the result back into a term.
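The forward translation just described can be sketched in plain
Prolog. This is an illustration under stated assumptions, not the
code in GULP.PL: the feature structure is given as a list of
Feature-Value pairs rather than in a:b..c:d notation, and
translate_pairs is a hypothetical helper name.

```prolog
% Forward schemas for the features a, b, and c, as listed above.
g_forward_schema(a, X, g_(X,_)).
g_forward_schema(b, X, g_(_,g_(X,_))).
g_forward_schema(c, X, g_(_,g_(_,g_(X,_)))).

% translate_pairs(+Pairs, -ValueList)
% Each value is marked with g_/1 and passed through its schema.
% Every call shares the same ValueList variable, so the minimal
% value lists are unified with one another as a side effect.
translate_pairs([], _).
translate_pairs([Feature-Value|Rest], ValueList) :-
    g_forward_schema(Feature, g_(Value), ValueList),
    translate_pairs(Rest, ValueList).
```

Calling translate_pairs with [a-xx,b-yy,c-zz] or with the same
pairs in any other order yields the same value list, with an open
tail after the third position.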
Backward translation is easier; GULP simply unifies the
value list with the second argument of g_backward_schema, and
the first argument immediately yields a rough translation. It is
rough in two ways: it mentions all the features in the grammar,
and it contains g_(...) marking all the feature values that were
mentioned in the original feature structure. The finished
translation is obtained by discarding all features whose values
are not marked by g_(...), and removing the g_(...) from values
that contain it.
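Backward translation can be sketched the same way. Again this is
only an illustration: the rough translation is represented as a
list of Feature-Value pairs instead of GULP notation, and
back_translate and keep_marked are hypothetical names.

```prolog
% Backward schema for a grammar with features a, b, and c.
% (Pairs stand in for the a:X..b:Y..c:Z notation.)
g_backward_schema([a-X, b-Y, c-Z], g_(X,g_(Y,g_(Z,_)))).

% back_translate(+ValueList, -Pairs)
% Unify the value list with the schema to get the rough translation,
% then keep only the features whose values are marked with g_/1.
back_translate(ValueList, Pairs) :-
    g_backward_schema(Rough, ValueList),
    keep_marked(Rough, Pairs).

keep_marked([], []).
keep_marked([Feature-V|Rest], Out) :-
    (   nonvar(V), V = g_(Value)       % mentioned: strip the mark
    ->  Out = [Feature-Value|Out1]
    ;   Out = Out1                     % never mentioned: discard
    ),
    keep_marked(Rest, Out1).
```

Given the value list g_(g_(xx),g_(_,g_(g_(zz),_))), back_translate
returns [a-xx,c-zz]; feature b is dropped because its position was
never marked.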
The translation schemas are built automatically. Whenever a
new feature is encountered, a forward schema is built for it, and
the pre-existing backward schema, if any, is replaced by a new
one. A g_features declaration causes the immediate generation of
schemas for all the features in it, in the order given. In
addition, GULP maintains a current g_features clause at all
times that lists all the features actually encountered, whether or
not they were originally declared.
4. GULP in practical use
4.1. A simple definite clause grammar
Figure 7 shows the grammar from Figure 3 implemented with
the definite clause grammar (DCG) parser that is built into
Prolog. Each nonterminal symbol has a GULP feature structure as
its only argument.
Parsing is done top-down. The output of the program reflects
the feature structures built during parsing. For example:
?- test1.
[max,sees,bill] (String being parsed)
sem: pred: SEES (Displayed feature structure)
arg1: BILL
arg2: MAX
Figure 8 shows the same grammar written in a more PATR-like
style. Instead of using feature structures in argument positions,
this program uses variables for arguments, then unifies each
variable with appropriate feature structures as a separate
operation. This is slightly less efficient but can be easier to
read, particularly when the unifications to be performed are
complex.
In this program, the features of np and vp are called
NPfeatures and VPfeatures respectively. More commonly, the
features of np, vp, and so on are in variables called NP, VP,
and the like. Be careful not to confuse upper- and lower-case
symbols.
The rules in Figure 8 could equally well have been written
with the unifications before the constituents to be parsed. That
is, we can write either
s(Sfeatures) --> np(NPfeatures), vp(VPfeatures),
{ Sfeatures = ... }.
or
s(Sfeatures) --> { Sfeatures = ... },
np(NPfeatures), vp(VPfeatures).
Because unification is order-independent, the choice affects
efficiency but not correctness. The only exception is that some
rules can loop when written one way but not the other. Thus
s(S1) --> s(S2), { S1 = x:a, S2 = x:b }.
loops, whereas
s(S1) --> { S1 = x:a, S2 = x:b }, s(S2).
does not, because in the latter case S2 is instantiated to a
value that must be distinct from S1 before s(S2) is parsed.
4.2. A hold mechanism for unbounded movements
Unlike a phrase-structure grammar, a unification-based
grammar can handle unbounded movements. That is, it can parse
sentences in which some element appears to have been moved from
its normal position across an arbitrary amount of structure.
Such a movement occurs in English questions. The question-
word (who, what, or the like) always appears at the beginning of
the sentence. Within the sentence, one of the places where a noun
phrase could have appeared is empty:
The boy said the dog chased the cat.
What did the boy say _ chased the cat? (The dog.)
What did the boy say the dog chased _? (The cat.)
Ordinary phrase-structure rules cannot express the fact that only
one noun phrase is missing. Constituents introduced by phrase-
structure rules are either optional or obligatory. If noun phrases
are obligatory, they can't be missing at all, and if they are
optional, any number of them can be missing at the same time.
Chomsky (1957) analyzed such sentences by generating what in
the position of the missing noun phrase, then moving it to the
beginning of the sentence by means of a transformation. This is
the generally accepted analysis.
To parse such sentences, one must undo the movement. This is
achieved through a hold stack. On encountering what, the parser
does not parse it, but rather puts it on the stack and carries it
along until it is needed. Later, when a noun phrase is expected
but not found, the parser can pop what off the stack and use it.
The hold stack is a list to which elements can be added at
the beginning. Initially, its value is [] (the empty list). To
parse a sentence, the parser must:
(1) Pass the hold stack to the NP, which may add or remove
items.
(2) Pass the possibly modified stack to the VP, which may
modify it further.
In traditional notation, the rule we need is:
        S              -->          NP                      VP
[ hold: [ in:  H1 ] ]       [ hold: [ in:  H1 ] ]   [ hold: [ in:  H2 ] ]
[       [ out: H3 ] ]       [       [ out: H2 ] ]   [       [ out: H3 ] ]
Here hold:in is the stack before parsing a given constituent,
and hold:out is the stack after parsing that same constituent.
Notice that three different states of the stack -- H1, H2, and
H3 -- are allowed for.
Figure 9 shows a complete grammar built with rules of this
type. There are two rules expanding S. One is the one above (S -->
NP VP). The other one accepts what did at the beginning of the
sentence, places what on the stack, and proceeds to parse an NP
and VP. Somewhere in the NP or VP -- or in a subordinate S
embedded therein -- the parser will use the rule
np(NP) --> [], { NP = hold: (in:[what|H1]..out:H1) }.
thereby removing what from the stack.
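The hold mechanism can be tried out in ordinary Prolog without GULP.
The following is a minimal sketch, not the grammar of Figure 9: the
hold:in and hold:out features are replaced by two explicit
arguments, the lexicon is a small hypothetical one, and sentence is
a hypothetical entry point.

```prolog
% s(HoldIn, HoldOut): thread the hold stack through the NP, then the VP.
s(H1, H3) --> np(H1, H2), vp(H2, H3).
% Question: accept "what did", push what onto the stack, then parse.
s(H1, H3) --> [what, did], np([what|H1], H2), vp(H2, H3).

% Ordinary noun phrases leave the stack unchanged.
np(H, H) --> [the, boy].
np(H, H) --> [the, dog].
np(H, H) --> [the, cat].
% Missing noun phrase: consume no words and pop what off the stack.
np([what|H], H) --> [].

vp(H1, H2) --> [chased], np(H1, H2).
vp(H1, H2) --> [say],    s(H1, H2).
vp(H1, H2) --> [said],   s(H1, H2).

% A sentence starts and ends with an empty hold stack.
sentence(Words) :- phrase(s([], []), Words).
```

Because the stack must be empty at the end, exactly one noun phrase
can be missing after what did. The query
?- sentence([what,did,the,boy,say,the,dog,chased]). succeeds, while
the same string with the cat added at the end is rejected.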
4.3. Building complex semantic structures
Figure 10 shows a GULP reimplementation of a program by
Johnson and Klein (1986) that makes extensive use of in and out
features to pass information around the parse tree. Johnson and
Klein's key insight is that the logical structure of a sentence is
largely specified by the determiners. For instance, A man saw a
donkey expresses a simple proposition with existentially quantified
variables, but Every man saw a donkey expresses an "if-then"
relationship (If X is a man then X saw a donkey). On the syntactic
level, every modifies only man, but semantically, it gives the
entire sentence a different structure.
Accordingly, Johnson and Klein construct their grammar so
that almost all the semantic structure is built by the
determiners. Each determiner must receive, from elsewhere in the
sentence, semantic representations for its scope and its
restrictor. The scope of a determiner is the main predicate of the
clause, and the restrictor is an additional condition imposed by
the NP to which the determiner belongs. For instance, in Every man
saw a donkey, the determiner every has scope saw a donkey and
restrictor man.
Figure 10 shows a reimplementation, in GULP, of a sample
program Johnson and Klein wrote in PrAtt (a different extension of
Prolog). The semantic representations built by this program are
those used in Discourse Representation Theory (Kamp, 1981;
Spencer-Smith, 1987). The meaning of a sentence or discourse is
represented by a discourse representation structure (DRS) such as:
[1,2,man(1),donkey(2),saw(1,2)]
Here 1 and 2 stand for entities (people or things), and man(1),
donkey(2), and saw(1,2) are conditions that these entities
must meet. The discourse is true if there are two entities such
that 1 is a man, 2 is a donkey, and 1 saw 2. In other words, "A
man saw a donkey." The order of the list elements is
insignificant, and the program builds the list backward, with
indices and conditions mixed together.
A DRS can contain other DRSes embedded in a variety of ways.
In particular, one of the conditions within a DRS can have the
form
DRS1 > DRS2
which means: "This condition is satisfied if for each set of
entities that satisfy DRS1, it is also possible to satisfy DRS2."
For example:
[1,man(1), [2,donkey(2)] > [saw(1,2)] ]
"There is an entity 1 such that 1 is a man, and for every entity 2
that is a donkey, 1 saw 2." That is, "Some man saw every donkey."
Again,
[ [1,man(1)] > [2,donkey(2),saw(1,2)] ]
means "every man saw a donkey" -- that is, "for every entity 1
such that 1 is a man, there is an entity 2 which is a donkey, and
1 saw 2."
Parsing a sentence begins with the rule:
s(S) --> { S = sem:A, NP = sem:A,
S = syn:B, VP = syn:B,
NP = sem:scope:C, VP = sem:C,
VP = syn:arg1:D, NP = syn:index:D }, np(NP), vp(VP).
This rule stipulates the following things:
(1) An S consists of an NP and a VP.
(2) The semantic representation of the S is the same as that of
the NP, i.e., is built by the rules that parse the NP.
(3) The syntactic feature structure (syn) of the S is that of the
VP. Crucially, this contains the indices of the subject (arg1)
and object (arg2).
(4) The scope of the NP (and hence of its determiner) is the
semantic representation of the VP.
(5) The index of the verb's subject (arg1) is that of the NP
mentioned in this rule.
Other rules do comparable amounts of work, and space precludes
explaining them in detail here. (See Johnson and Klein 1985, 1986
for further explanation.) By unifying appropriate in and out
features, the rules perform a complex computation in an order-
independent way.
4.4. Bottom-up parsing
GULP is not tied to Prolog's built-in DCG parser. It can be
used with any other parser implemented in Prolog. Figure 11 shows
how GULP can be used with the BUP bottom-up parser developed by
Matsumoto et al. (1986).6
In bottom-up parsing, the typical question is not "How do I
parse an NP?" but rather, "Now that I've parsed an NP, what do I
do with it?" BUP puts the Prolog search mechanism to good use in
answering questions like this.
During a BUP parse, two kinds of goals occur. A goal such as
?- np(s,NPf,Sf,[chased,the,cat],[]).
means: "An NP has just been accepted; its features are contained
in NPf. This occurred while looking for an S with features Sf.
Immediately after parsing the NP, the input string was
[chased,the,cat]. After parsing the S, it will be []."
The other type of goal is
?- goal(vp,VPf,[chased,the,cat],[]).
This means "Parse a VP with features VPf, starting with the input
string [chased,the,cat] and ending up with []." This is like
the DCG goal
?- vp(VPf,[chased,the,cat],[]).
except that the parsing is to be done bottom-up.
To see how these goals are constructed, imagine replacing
the top-down parsing rule
s --> np, vp.
with the bottom-up rule
np, vp --> s.
This rule should be used when the parser is looking for a rule
that will tell it how to use an NP it has just found. So np(...)
should be the head of the Prolog clause. Ignoring feature
unifications, the clause will be:
np(G,NPf,Gf,S1,S3) :- goal(vp,VPf,S1,S2),
s(G,Sf,Gf,S2,S3).
6 It has been suggested that a combination of GULP
and BUP should be known as BURP. This suggestion has
not been acted upon.
That is: "Having just found an NP with features NPf, parse a VP
with features VPf. You will then have completed an S, so look for
a clause that tells you what to do with it."
Here S1, S2, and S3 represent the input string initially,
after parsing the VP, and after completing the S. G is the higher
constituent that was being sought when the NP was found, and Gf
contains its features. If, when the S is completed, it turns out
that an S was being sought (the usual case), then execution can
finish with the terminal rule
s(s,F,F,X,X).
Otherwise another clause for s(...) must be searched for.
Much of the work of BUP is done by the goal-forming
predicate goal, defined thus:
goal(G,Gf,S1,S3) :-
word(W,Wf,S1,S2),
NewGoal =.. [W,G,Wf,Gf,S2,S3],
call(NewGoal).
That is (ignoring features): "To parse a G in input string S1
leaving the remaining input in S3, first accept a word, then
construct a new goal depending on its category (W)." For example,
the query
?- goal(s,Sf,[the,dog,barked],S3).
will first call
?- word(W,Wf,[the,dog,barked],[dog,barked]).
thereby instantiating W to det and Wf to the word's features,
and then construct and call the goal
?- det(s,Wf,Sf,[dog,barked],S3).
That is: "I've just completed a det and am trying to parse an s.
What do I do next?" A rule such as
det, n --> np
(or rather its BUP equivalent) can be invoked next, to accept
another word (a noun) and complete an NP.
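The control flow described in this section can be seen in a
feature-free sketch. This is not the program of Figure 11: the
feature arguments are omitted, the lexicon and the rules
v, np --> vp and v --> vp are hypothetical additions, and only the
shape of goal and of the terminal clauses follows the text.

```prolog
% Hypothetical lexicon.
lex(the, det).   lex(dog, n).      lex(cat, n).
lex(barked, v).  lex(chased, v).

% word(Category, Input, Rest): accept one word of the given category.
word(Cat, [W|Rest], Rest) :- lex(W, Cat).

% goal(G, S1, S3): parse a G bottom-up, starting from the first word.
goal(G, S1, S3) :-
    word(W, S1, S2),
    NewGoal =.. [W, G, S2, S3],
    call(NewGoal).

% For each category: a terminal clause (the category just completed
% is the one being sought), then the bottom-up rules it can start.
det(det, X, X).
det(G, S1, S3) :- goal(n, S1, S2), np(G, S2, S3).    % det, n --> np

n(n, X, X).

np(np, X, X).
np(G, S1, S3) :- goal(vp, S1, S2), s(G, S2, S3).     % np, vp --> s

v(v, X, X).
v(G, S1, S3) :- goal(np, S1, S2), vp(G, S2, S3).     % v, np --> vp
v(G, S1, S2) :- vp(G, S1, S2).                       % v --> vp

vp(vp, X, X).

s(s, X, X).
```

With these clauses, ?- goal(s,[the,dog,barked],[]). first accepts
the, calls det(s,[dog,barked],[]), and proceeds as described above.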
5. Comparison with other systems
5.1. GULP versus PATR-II
PATR-II (Shieber 1986a, b) is the most widely used software
tool for implementing unification-based grammars, as well as the
most mature and sophisticated. It differs from GULP in three main
ways:
(1) Whereas GULP is an extension of Prolog, PATR-II is a new self-
contained programming language.
(2) Whereas GULP allows the use of any parsing algorithm, PATR-II
provides one specific parser (left-corner, Earley, or Cocke-
Kasami-Younger, depending on the version).
(3) Whereas GULP grammar rules treat feature structures as data
items, PATR-II grammar rules state equations on feature values.
Of these, (3) makes the biggest practical difference. The
rule which GULP writes as
s(person:X..number:Y) -->
np(person:X..number:Y),
vp(person:X..number:Y).
(assuming use of the DCG parser) is rendered in PATR-II as:
Rule S --> NP VP:
<S number> = <NP number>
<S number> = <VP number>
<S person> = <NP person>
<S person> = <VP person>.
or the like. Paths are permitted, of course; one could write <NP
syn agr number> to build a more complex structure.
Here S, NP, and VP are not pure variables; the equations
<S cat> = s
<NP cat> = np
<VP cat> = vp
(or the equivalent) are implicit. Further abbreviatory power comes
from templates, which are predefined sets of features and values.
Thus, instead of writing the lexical entry
Word sleeps: <cat> = v
<person> = third
<number> = singular
<subcat> = intransitive.
the PATR-II programmer can define the template
Let ThirdSingVerb be <cat> = v
<person> = third
<number> = singular.
and then write:
Word sleeps: ThirdSingVerb
<subcat> = intransitive.
Word chases: ThirdSingVerb
<subcat> = transitive.
The GULP equivalent of a template is a Prolog fact such as:
thirdsingverb(person:third..number:singular).
Lexical entries can then use this as an abbreviatory device:
v(Vf) --> [sleeps], { thirdsingverb(Vf),
Vf = subcat:intransitive }.
v(Vf) --> [chases], { thirdsingverb(Vf),
Vf = subcat:transitive }.
(There is no cat:v here because in the DCG parser, categories are
functors rather than feature values.)
Unlike GULP, PATR-II provides for default inheritance. That
is, the programmer can invoke a template and then change some of
the values that it supplies, thus:
Word does: ThirdSingVerb
<cat> = auxverb.
This means: "Does is a ThirdSingVerb except that its category is
not v but rather auxverb." PATR-II also provides for lexical
redundancy rules that transform one lexical entry into another,
e.g., building a passive verb from every active verb.
Both of these capabilities are absent from GULP per se, but
they could be built into a parser written in GULP. Indeed, many
contrasts between GULP and PATR-II reflect the fact that PATR-II
is a custom-built environment for implementing grammars that fit a
particular mold, while GULP is a minimal extension to a much more
general-purpose programming language.
One advantage of GULP is that the full range of Prolog data
structures is available. Shieber (1986a:28-32) equips each verb
with an ordered list of NPs that are its syntactic arguments
(subject, object, etc.). But there are no lists in PATR-II, so
Shieber has to construct them as nested feature structures:
[ first: ...first element...
  rest:  [ first: ...second element...
           rest:  [ first: ...third element...
                    rest:  [ first: ...fourth element...
                             rest:  end ] ] ] ]
This may be desirable on grounds of theoretical parsimony, but it
is notationally awkward. In GULP, one can simply write
[X1,X2,X3,X4], where X1, X2, X3, and X4 are variables,
constants, feature structures, or terms of any other kind.
5.2. GULP versus PrAtt
PrAtt (Prolog with Attributes), described briefly by Johnson
and Klein (1986), is a PATR-like extension of Prolog. In PrAtt,
feature structure equations are treated as Prolog goals. An
example is the DCG rule:
s(Sf) --> np(NPf), vp(VPf),
{ Sf:number = NPf:number,
Sf:number = VPf:number,
Sf:person = NPf:person,
Sf:person = VPf:person }.
This looks almost like GULP syntax, but the meaning is different.
NPf:number is not a Prolog term, but rather an evaluable
expression; at execution time, it is replaced by the number
element of structure NPf.
Compared to GULP, PrAtt makes a much bigger change to the
semantics of Prolog. GULP merely translates data into data
(changing the format from feature structures to value lists), but
PrAtt translates data into extra operations.
An example will make this clearer. In order to execute
Sf:number = NPf:number, PrAtt must extract the number
features of Sf and NPf, then unify them. In Johnson and Klein's
implementation, this extraction is done at run time; that is, on
finding the expression Sf:number, the PrAtt interpreter looks at
the contents of Sf, and then replaces Sf:number with the value
of Sf's number feature.
This implies that the value of Sf is known. If it is not --
for example, if the PrAtt-to-Prolog translation is being performed
before running the program -- then extra goals must be inserted
into the program to extract the appropriate feature values. The
single PrAtt goal Sf:number = NPf:number becomes at least
three goals:
(1) Unify Sf with something that will put the number value
into a unique variable (call it X).
(2) Unify NPf with something that will put its number value
into a unique variable (call it Y).
(3) Unify X and Y.
To put this another way, whereas GULP modifies the syntax
for Prolog terms, PrAtt modifies the unification algorithm, using
three calls to the existing Prolog unification algorithm to
perform one PrAtt unification.
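The three-goal expansion can be pictured as follows. This is a
sketch of the idea only, not the output of any actual PrAtt
translator; the layout of the structures (person first, number
second) and the names feature_value and same_number are
assumptions made for the example.

```prolog
% Hypothetical accessors: each clause picks one position out of a
% structure laid out as g_(Person, g_(Number, Rest)).
feature_value(person, g_(V,_),       V).
feature_value(number, g_(_,g_(V,_)), V).

% The single goal  Sf:number = NPf:number  expanded into three goals.
same_number(Sf, NPf) :-
    feature_value(number, Sf,  X),    % (1) number of Sf into X
    feature_value(number, NPf, Y),    % (2) number of NPf into Y
    X = Y.                            % (3) unify X and Y
```

In GULP the same constraint costs no extra goals, because both
structures would simply share a variable in the number position.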
5.3. GULP versus AVAG
AVAG (Attribute-Value Grammar, Sedogbo 1986) is an
implementation of generalized phrase structure grammar (GPSG), a
framework for expressing linguistic analyses. A three-pass
compiler translates AVAG notation into Prolog II. As such, AVAG is
far more complex than GULP or PrAtt, and there is little point in
making a direct comparison. Comparing AVAG to PATR-II would be
instructive but is outside the scope of this paper.
AVAG is interesting because it uses the Prolog II built-in
predicates dif (which means "this variable must never be
instantiated to this value") and freeze ("wait until these
variables are instantiated, then test them") to implement
negative-valued and set-valued features respectively. For
example, the rule
voit:
<cat> = verb
<person> /= 2
<number> = sing.
uses dif to ensure that the person feature never equals 2, and
chaque:
<cat> = art
<gender> = [mas,fem]
<number> = sing.
uses freeze to ensure that when the gender feature becomes
instantiated, its value is mas or fem. There are no direct
equivalents for dif or freeze in conventional (Edinburgh)
Prolog; they could be implemented only by changing the inference
engine.
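Some Prolog systems outside the Edinburgh tradition, SWI-Prolog
among them, do supply dif/2 and freeze/2 as built-in predicates. In
such a system the two entries above could be sketched as follows.
This is an illustration, not AVAG's actual translation; the
predicate names voit and chaque and their argument layout are
assumptions.

```prolog
% voit(Person, Number): Person may be anything except 2.
voit(Person, sing) :-
    dif(Person, 2).

% chaque(Gender, Number): once Gender becomes instantiated,
% it must be mas or fem.
chaque(Gender, sing) :-
    freeze(Gender, memberchk(Gender, [mas, fem])).
```

After ?- voit(P,_). a later attempt to unify P with 2 fails, and
after ?- chaque(G,_). a later attempt to unify G with any value
other than mas or fem fails.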
5.4. GULP versus STUF
STUF (Stuttgart Formalism) is a formal language for
describing unification-based grammars. It is more comprehensive
than PATR-II and as yet is only partly implemented (Bouma et al.
1988).
Comparing STUF to GULP would be rather like comparing linear
algebra to Fortran; the systems are not in the same category.
Nonetheless, STUF introduces a number of novel ideas that could be
exploited in parsers or other systems written in GULP.
The biggest of these is nondestructive unification. In
Prolog, unification is a destructive operation; terms that are
being unified are replaced by their unifier. For example, if X =
[a,_] and Y = [_,b], then after unifying X and Y, X = Y =
[a,b]. In STUF, on the other hand, an expression such as
z = (x y)
creates a third structure z whose value is the unifier of x and
y; x and y themselves are unaffected. Nondestructive unification
can be implemented in Prolog by copying the terms before unifying
them (Covington et al. 1988:204).
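A minimal sketch of the copying approach, using the standard
built-in copy_term/2, is given here. The predicate name unify_copy
is hypothetical, and the code is not taken from the book cited.

```prolog
% unify_copy(+X, +Y, -Z): Z is the unifier of X and Y.
% X and Y are copied together, so that any variables they share
% remain shared in the copies; the originals are left untouched.
unify_copy(X, Y, Z) :-
    copy_term(X-Y, X1-Y1),
    X1 = Y1,
    Z = X1.
```

With X = [a,_] and Y = [_,b], the goal unify_copy(X,Y,Z) gives
Z = [a,b], while the second element of X and the first element of Y
are still uninstantiated.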
Further, if the unification fails, z gets the special value
FAIL. If x and y are feature structures, and parts of them are
unifiable but other parts are not, the non-unifiable parts will be
represented by FAIL in the corresponding parts of z. This
provides a way to implement negative-valued features. For example,
to ensure that a verb is not third person singular, one can
stipulate that when its person feature is unified with
person:3, the result is FAIL.
In STUF, a feature can also have a set of alternatives as
its value, and when two structures containing such sets are
unified, the unifier is the set of all the unifiers of structures
that would result from choosing different alternatives.
Finally, STUF exploits the fact that grammar rules can
themselves be treated as feature structures. For example, the rule
S  -->  NP               VP
        [ person:X ]     [ person:X ]
        [ number:Y ]     [ number:Y ]
(or more precisely the tree structure that it sanctions) can be
expressed as the feature structure
[ mother:      [ category: s ]

  daughter_1:  [ category: np
                 person: X
                 number: Y ]

  daughter_2:  [ category: vp
                 person: X
                 number: Y ] ]
STUF therefore implements grammar rules via "graph
application," an operation that treats one feature structure
(directed acyclic graph) as a function to be applied to another.
Graph application is an operation with four arguments: (1) a
graph expressing the function; (2) a graph to be treated as the
argument; (3) a path indicating what part of the argument graph is
to be unified with the function graph; and (4) a path indicating
what part of the argument graph should be unified (destructively)
with the result of the first unification. The ordinary GULP
practice of simply unifying one graph with another is a special
case of this.
6. Future prospects
6.1. Possible improvements
One disadvantage of GULP is that every feature structure
must contain a position for every feature in the grammar. This
makes feature structures larger and slower to process than they
need be. By design, unused features often fall in the
uninstantiated tail of the value list, and hence take up neither
time nor space. But not all unused features have this good
fortune. In practice, almost every value list contains gaps, i.e.,
positions that will never be instantiated, but must be passed over
in every unification.
To reduce the number of gaps, GULP could be modified to
distinguish different types of value lists. The feature structure
for a verb needs a feature for tense; the feature structure for a
noun does not. Value lists of different types would reserve the
same positions for different features, skipping features that
would never be used. Some kind of type marker, such as a unique
functor, would be needed so that value lists of different types
would not unify with each other.
Types of feature structures could be distinguished by the
programmer -- e.g., by giving alternative g_features
declarations -- or by modifying the GULP translator itself to look
for patterns in the use of features.
6.2. Keyword parameters via GULP
Unification-based grammar is not the only use for GULP.
Feature structures are a good formalization of keyword-value
argument lists.
Imagine a complicated graphics procedure that takes
arguments indicating desired window size, maximum and minimum
coordinates, and colors, all of which have default values. In
Pascal, the procedure can only be called with explicit values for
all the parameters:
OpenGraphics(480,640,-240,240,-320,320,green,black);
There could, however, be a convention that 0 means "take the
default:"
OpenGraphics(0,0,0,0,0,0,red,blue);
Prolog can do slightly better by using uninstantiated arguments
where defaults are wanted, and thereby distinguishing "default"
from "zero":
:- open_graphics(_,_,_,_,_,_,red,blue).
In GULP, however, the argument of open_graphics can be a
feature structure in which the programmer mentions only the non-
default items:
:- open_graphics( foreground:red..background:blue ).
In this feature structure, the values for x_resolution,
y_resolution, x_maximum, x_minimum, y_maximum, and
y_minimum (or whatever they are called) are left uninstantiated
because they are not mentioned. So in addition to facilitating the
implementation of unification-based grammars, GULP provides Prolog
with a keyword argument system.
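How the called procedure might supply the defaults can be sketched
in plain Prolog, working directly on the value list that GULP would
produce. Everything specific here is an assumption made for the
example: the canonical order (foreground, background,
x_resolution), the default values, the settings/3 result term, and
the helper value_or_default.

```prolog
% open_graphics(+ValueList, -Settings)
% ValueList is g_(Foreground, g_(Background, g_(XResolution, _))),
% where each mentioned value is wrapped in g_/1.
open_graphics(g_(Fg0, g_(Bg0, g_(XRes0, _))), settings(Fg, Bg, XRes)) :-
    value_or_default(Fg0,   green, Fg),
    value_or_default(Bg0,   black, Bg),
    value_or_default(XRes0, 640,   XRes).

% An uninstantiated position means the keyword was not mentioned,
% so the default applies; otherwise strip the g_/1 mark.
value_or_default(Slot, Default, Default) :- var(Slot), !.
value_or_default(g_(Value), _, Value).
```

The value list for foreground:red..background:blue under this order
is g_(g_(red),g_(g_(blue),_)); passing it to open_graphics yields
settings(red,blue,640).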
References
Barton, G. Edward; Berwick, Robert C.; and Ristad, Eric Sven.
1987. Computational complexity and natural language.
Cambridge, Massachusetts: MIT Press.
Bouma, Gosse; König, Esther; and Uszkoreit, Hans. 1988. A flexible
graph-unification formalism and its application to natural-
language processing. IBM Journal of Research and Development
32:170-184.
Bresnan, Joan, ed. 1982. The mental representation of grammatical
relations. Cambridge, Massachusetts: MIT Press.
Chomsky, Noam. 1957. Syntactic structures. (Janua linguarum, 4.)
The Hague: Mouton.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge,
Massachusetts: MIT Press.
Covington, Michael A. 1987. GULP 1.1: an extension of Prolog for
unification-based grammar. ACMC Research Report 01-0021.
Advanced Computational Methods Center, University of
Georgia.
Covington, Michael A.; Nute, Donald; and Vellino, André. 1988.
Prolog programming in depth. Glenview, Ill.: Scott,
Foresman.
Gazdar, Gerald; Klein, Ewan; Pullum, Geoffrey; and Sag, Ivan.
1985. Generalized phrase structure grammar. Cambridge,
Massachusetts: Harvard University Press.
Giannesini, Francis; Kanoui, Henry; Pasero, Robert; and van
Caneghem, Michel. 1986. Prolog. Wokingham, England: Addison-
Wesley.
Johnson, Mark, and Klein, Ewan. 1985. A declarative formulation of
Discourse Representation Theory. Paper presented at the
summer meeting of the Association for Symbolic Logic, July
15-20, 1985, Stanford University.
Johnson, Mark, and Klein, Ewan. 1986. Discourse, anaphora, and
parsing. Report No. CSLI-86-63. Center for the Study of
Language and Information, Stanford University. Also in
Proceedings of Coling86 669-675.
Joshi, Aravind K. 1986. The convergence of mildly context-
sensitive grammar formalisms. Draft distributed at Stanford
University, 1987.
Kamp, Hans. 1981. A theory of truth and semantic representation.
Reprinted in Groenendijk, J.; Janssen, T. M. V.; and
Stokhof, M., eds., Truth, interpretation, and information.
Dordrecht: Foris, 1984.
Kaplan, Ronald M., and Bresnan, Joan. 1982. Lexical-Functional
Grammar: a formal system for grammatical representation.
Bresnan 1982:173-281.
Karttunen, Lauri. 1986a. D-PATR: a development environment for
unification-based grammars. Report No. CSLI-86-61. Center
for the Study of Language and Information, Stanford
University. Shortened version in Proceedings of Coling86 74-
80.
Karttunen, Lauri. 1986b. Features and values. Shieber et al. 1986
(vol. 1), 17-36. Also in Proceedings of Coling84 28-33.
Matsumoto, Yuji; Tanaka, Hozumi; and Kiyono, Masaki. 1986. BUP: a
bottom-up parsing system for natural languages. Michel van
Caneghem and David Warren, eds., Logic programming and its
applications 262-275. Norwood, N.J.: Ablex.
Pollard, Carl, and Sag, Ivan A. 1987. Information-based syntax and
semantics, vol. 1: Fundamentals. (CSLI Lecture Notes, 13.)
Center for the Study of Language and Information, Stanford
University.
Sedogbo, Celestin. 1986. AVAG: an attribute/value grammar tool.
FNS-Bericht 86-10. Seminar für natürlich-sprachliche
Systeme, Universität Tübingen.
Shieber, Stuart M. 1986a. An introduction to unification-based
approaches to grammar. (CSLI Lecture Notes, 4.) Center for
the Study of Language and Information, Stanford University.
Shieber, Stuart M. 1986b. The design of a computer language for
linguistic information. Shieber et al. (eds.) 1986 (vol. 1)
4-26.
Shieber, Stuart M.; Pereira, Fernando C. N.; Karttunen, Lauri; and
Kay, Martin, eds. 1986. A compilation of papers on unification-
based grammar formalisms. 2 vols. bound as one. Report No.
CSLI-86-48. Center for the Study of Language and
Information, Stanford University.
Spencer-Smith, Richard. 1987. Semantics and discourse
representation. Mind and Language 2.1: 1-26.
Appendix. GULP 2.0 User's Guide
A.1 Overview
GULP is a laboratory instrument, not a commercial product.
Although reasonably easy to use, it lacks the panache and
sophistication of Turbo Pascal or Arity Prolog 5.0. The emphasis
is on getting the job done as simply as possible.
The final word on how GULP works is contained in the file
GULP.ARI or GULP.PL, which you should consult whenever you have a
question that is not answered here.
A.2 Installation and access
On the VAX, GULP is already installed, and you reach it by
the command
$ gulp
This puts you into a conventional Prolog environment (note: not a
Quintus Prolog split-screen environment) in which the GULP built-
in predicates are available.
The IBM PC version of GULP is supplied as a modified copy of
Arity Prolog Interpreter 4.0. It is for use only on machines for
which Arity Prolog is licensed and is not for distribution outside
the AI Research Group.
Many of the GULP file names are the same as files used by
the unmodified Arity Prolog Interpreter. It is therefore important
that GULP be installed in a different directory.
To run GULP you also need a full-screen editor that is
accessible by the command:
edit filename
GULP passes commands of this form to DOS when invoking the editor.
We usually use AHED.COM, renamed EDIT.COM, for the purpose, but
you can use any editor that produces ASCII files.
A.3 How to run programs
GULP is simply a version of Prolog with more built-in
predicates added. All the functions and features of Prolog7 are
still available and work exactly as before. GULP is used in
exactly the same way as Prolog except that:
(1) Programs containing feature structures must be loaded
via the built-in predicate load, not consult or reconsult. The
reason is that consult or reconsult would load the feature
structures into memory without converting them into value lists.
Prolog will do this without complaining, but GULP programs will
not work. Never use consult or reconsult to load anything that
contains GULP feature structures.
(2) You must always invoke the editor with the GULP command
ed, not with whatever command you would use in Prolog. This is
important because your ordinary editing command would
automatically invoke reconsult after editing; ed invokes load
instead.
(3) You cannot use feature structure notation in a query
because queries do not go through the translator. Write your
program so that you can invoke all the necessary predicates
without having to type feature structures on the command line.
A.4 Built-in predicates usually used as commands
?- load filename.
Loads the program on file filename into memory via the GULP
translator. load is like reconsult in that, whenever a
predicate is encountered that is already defined, the old
definition is discarded before the new definition is loaded. This
means that all clauses defining a predicate must be on the same
file, but they need not be contiguous. You can also load
definitions of different predicates from different files without
conflict, and you can embed a call to load in a program to
make it load another program.
If the file name is not in quotes, the ending .GLP is added. If
the file name contains a period, it must be typed in quotes
(single or double).
7 Except the module system, which uses the colon
(':') for its own purposes and conflicts with GULP
syntax.
?- load.
Loads (again) the file that was used in the most recent call to
load or ed.
?- ed filename.
Calls the editor to process file filename, then loads that file.
?- ed.
Edits and loads (again) the file that was used in the most recent
call to load or ed.
?- list P/N.
Lists all clauses that define the predicate P with N arguments,
provided these clauses were loaded with ed or load. (Note: In
the case of grammar rules, the number of arguments includes the
arguments automatically supplied by the Prolog grammar rule
translator.)
?- list P.
Lists all clauses that define the predicate P with any number of
arguments, provided these clauses were loaded with ed or load.
?- list.
Lists all clauses that were loaded with ed or load.
?- new.
Clears the workspace; removes from memory all clauses that were
loaded with ed or load. (Does not delete clauses that were
placed into memory by consult, reconsult, or assert.)
A.5 Built-in predicates usually used within the program
g_translate(FeatureStructure,ValueList)
Translates a feature structure into a value list, or vice versa.
Used when you must interconvert internal and external
representations at run time (e.g., to input or output them). For
example, the following will accept a feature structure in GULP
notation from the keyboard, translate it into a value list, and
pass the value list to your predicate test:
?- read(X), g_translate(X,Y), test(Y).
The following translates a feature structure X from internal to
external representation and prints it out:
... g_translate(Y,X), write(Y).
display_feature_structure(X)
Displays X in a convenient indented notation (not GULP syntax),
where X is either a feature structure or a value list.
g_display(X)
Equivalent to display_feature_structure(X); retained for
compatibility with GULP 1.
g_printlength(A,N)
Where A is an atom, instantiates N to the number of characters
needed to print it. Useful in constructing your own indented
output routines.
writeln(X)
If X is a list, writes each element of X on a separate line and
then begins a new line. If X is not a list, writes X and then
begins a new line. Examples:
writeln('This is a message.').
writeln(['This is a','two-line message.']).
Lists within lists are not processed recursively. The elements of
the outermost list are printed one per line, and lists within it
are printed as lists.
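The behavior described for writeln can be sketched in standard Prolog as follows. This is an illustration only, not GULP's actual source. The name sketch_writeln and its helpers are hypothetical, and the sketch assumes that each list element is followed by exactly one newline, with no extra blank line at the end.

```prolog
% sketch_writeln(+X): hypothetical stand-in for the writeln/1 described above.
% Assumption: a list prints one element per line; anything else prints as is.
sketch_writeln(X) :-
    proper_list(X), !,
    write_lines(X).
sketch_writeln(X) :-
    write(X), nl.

% write_lines(+List): write each element followed by a newline.
% Nested lists are written with write/1, so they appear as lists.
write_lines([]).
write_lines([H|T]) :-
    write(H), nl,
    write_lines(T).

% proper_list(+X): X is a closed list (not a variable or a partial list).
proper_list(X) :- nonvar(X), X = [].
proper_list(X) :- nonvar(X), X = [_|T], proper_list(T).
```

With these definitions, sketch_writeln([a,[b,c]]) writes a on one line and the inner list [b,c] on the next, because only the outermost list is taken apart.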
append(List1,List2,List3)
Concatenates List1 and List2 giving List3, or splits List3
into List1 and List2.
member(Element,List)
Succeeds if Element is an element of List. If Element is
uninstantiated, it will be instantiated, upon backtracking, to
each successive element of List.
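Both predicates can be run in more than one direction. The example below writes out the usual textbook definitions so that it is self-contained; the names sketch_append and sketch_member are hypothetical and are used only to avoid clashing with built-in versions.

```prolog
% sketch_append(?List1, ?List2, ?List3): List3 is List1 followed by List2.
sketch_append([], L, L).
sketch_append([H|T], L, [H|R]) :- sketch_append(T, L, R).

% sketch_member(?Element, ?List): Element occurs in List.
sketch_member(X, [X|_]).
sketch_member(X, [_|T]) :- sketch_member(X, T).

% Concatenating:   ?- sketch_append([a,b], [c], L).
% Splitting:       ?- sketch_append(X, Y, [a,b]).
%                  (backtracking yields every way of dividing the list)
% Enumerating:     ?- sketch_member(E, [x,y,z]).
%                  (backtracking yields each element in turn)
```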
remove_duplicates(List1,List2)
Removes duplicate elements from List1 giving List2.
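The description does not say which occurrence of a repeated element is kept. The following sketch, in standard Prolog, keeps the first occurrence and compares elements with ==, so that unbound variables are not accidentally unified. Both choices are assumptions, and the name sketch_remove_duplicates is hypothetical.

```prolog
% sketch_remove_duplicates(+List1, -List2): List2 is List1 with later
% repetitions of an element dropped (first occurrence kept).
sketch_remove_duplicates(List, Unique) :-
    dedupe(List, [], Unique).

% dedupe(+Remaining, +SeenSoFar, -Unique)
dedupe([], _, []).
dedupe([H|T], Seen, Out) :-
    (   seen_before(H, Seen)
    ->  Out = Rest
    ;   Out = [H|Rest]
    ),
    dedupe(T, [H|Seen], Rest).

% seen_before(+X, +List): some element of List is identical (==) to X.
seen_before(X, [Y|Ys]) :-
    (   X == Y
    ->  true
    ;   seen_before(X, Ys)
    ).
```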
retractall(P)
Retracts (abolishes) all clauses whose predicate is P.
phrase(Constituent,InputString)
Provides a simplified way to call a parser written with DCG rules;
for example, the goal ?- phrase(s,[the,dog,barks]) is
equivalent to ?- s([the,dog,barks],[]).
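The equivalence can be tried out with a very small grammar in any Prolog that supports grammar rule notation. The three rules below are invented for this illustration and contain no feature structures, so they do not require GULP.

```prolog
% A three-rule DCG with no arguments on the nonterminals.
s  --> np, vp.
np --> [the], [dog].
vp --> [barks].

% Calling the parser through phrase/2:
%   ?- phrase(s, [the,dog,barks]).
% Calling it directly, with the two string arguments that the
% grammar rule translator adds to s:
%   ?- s([the,dog,barks], []).
```

Both goals succeed for the same input strings; phrase/2 merely hides the two extra arguments.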
copy(X,Y)
Copies term X, giving term Y. These terms are the same except
that all uninstantiated variables in X are replaced by new
uninstantiated variables in Y, arranged in the same way.
Variables in Y can then be instantiated without affecting X.
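In Prolog systems that provide the standard built-in copy_term/2, the effect described for copy can be observed directly. The predicate name demo_copy is hypothetical, and this report does not state whether GULP's copy is implemented in terms of copy_term/2.

```prolog
% demo_copy(-Y): copy a term with a shared variable, then bind only the copy.
demo_copy(Y) :-
    X = f(A, B, A),      % A occurs twice in X
    copy_term(X, Y),     % Y gets fresh variables arranged in the same pattern
    Y = f(1, 2, 1),      % instantiating the copy ...
    var(A), var(B).      % ... leaves the variables of X uninstantiated
```

Because the pattern of variables is preserved, the first and third arguments of the copy are still the same variable, so the copy would not unify with f(1,2,3).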
call_if_possible(Goal)
Executes Goal, or fails without an error message if there are no
clauses for Goal. (In Quintus Prolog, the program crashes with an
error message if there is an attempt to query a goal for which
there are no clauses.)
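In a Prolog that raises an ISO-style existence error for undefined predicates, similar behavior can be sketched as follows. The name sketch_call_if_possible is hypothetical, and this is not necessarily how GULP implements call_if_possible. The sketch has one known limitation: an existence error raised deeper inside Goal is also turned into failure.

```prolog
% sketch_call_if_possible(+Goal): run Goal, but fail quietly when the
% system reports that a called procedure does not exist.
sketch_call_if_possible(Goal) :-
    catch(Goal,
          error(existence_error(procedure, _), _),
          fail).
```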
g_fs(X)
Succeeds if X is an untranslated feature structure, i.e., a term
whose principal functor is ':', '..', or '::'.
g_not_fs(X)
Succeeds if X is not an untranslated feature structure.
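The two tests above amount to inspecting the principal functor of a term. A minimal sketch in standard Prolog follows; the names sketch_fs, sketch_not_fs, and fs_functor are hypothetical, and the sketch assumes that all three operators are binary and that an unbound variable does not count as a feature structure.

```prolog
% sketch_fs(+X): X is a compound term whose principal functor is
% ':'/2, '..'/2 or '::'/2.
sketch_fs(X) :-
    compound(X),
    functor(X, F, 2),
    fs_functor(F).

fs_functor(':').
fs_functor('..').
fs_functor('::').

% sketch_not_fs(+X): the complement of sketch_fs/1.
sketch_not_fs(X) :-
    \+ sketch_fs(X).
```

The operators are written as quoted atoms here so that the sketch loads without any operator declarations.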
g_vl(X)
Succeeds if X is a value list (the internal representation of a
feature structure).
A.6 Other built-in predicates
g_ed_command(X)
Instantiates X to the command presently used to call the editor.
To call a different editor, assertz your own clause for this
predicate (e.g., 'g_ed_command(emacs)').
g_herald(X)
Instantiates X to an atom identifying the current version of
GULP.
A.7 Differences between GULP 1.1 and 2.0
(1) g_features declarations are no longer required, but are
still permitted, and, if used, need not be complete.
(2) The operator '..' is now preferred in place of '::'. However,
the older form can still be used.
(3) There have been minor changes in the operator precedence of
':' and '..' relative to other operators. This is extremely
unlikely to cause problems unless you have written feature
structures that contain other operators such as '+' or '-'.
(4) GULP 2.0 distinguishes between features that are mentioned but
uninstantiated, and features that are never mentioned. Previously,
g_display never printed out any uninstantiated features.
(5) Bugs have been corrected. Translation of value lists to
feature structures now works correctly.
(6) Some rarely used built-in predicates have been deleted. In all
cases these predicates had more common synonyms (ed rather than
g_ed, list rather than g_list, etc.).
(7) list translates feature structures into GULP notation before
displaying them. (A debugger with the same capability is planned.)
(8) Nested loads are now supported. That is, a file being loaded
can contain a directive such as ':- load file2.' which will be
executed correctly.
Figure 1. A syntactic tree (based on Chomsky 1957).

                 S
          +------+------+
          NP            VP
      +---+---+         |
      D       N         V
      |       |         |
     The     dog      barks.
Figure 2. The same tree with features added.

                               S
              +----------------+----------------+
             NP                                VP
      [ num:singular ]                  [ num:singular  ]
              |                         [ pers:3rd      ]
              |                         [ tense:present ]
      +-------+-------+                         |
      D               N                         V
      |       [ num:singular ]          [ num:singular  ]
      |               |                 [ pers:3rd      ]
      |               |                 [ tense:present ]
      |               |                         |
     The             dog                      barks.
Figure 3. An example of a unification-based grammar.

[a]           S            -->        NP                 VP
     [ sem: [ pred:X ] ]         [ sem:Y    ]   [ sem: [ pred:X ] ]
     [      [ arg1:Y ] ]         [ case:nom ]   [      [ arg2:Z ] ]
     [      [ arg2:Z ] ]

[b]           VP           -->        V              NP
     [ sem: [ pred:X1 ] ]        [ sem:X1 ]     [ sem:Y1   ]
     [      [ arg2:Y1 ] ]                       [ case:acc ]

[c]           V            -->   sees
        [ sem:SEES ]

[d]           NP           -->   Max
        [ sem:MAX ]

[e]           NP           -->   Bill
        [ sem:BILL ]

[f]           NP           -->   me
        [ sem:ME   ]
        [ case:acc ]
Figure 4. Bottom-up parsing of Max sees Bill.

a. Rules [c], [d], and [e] supply features for individual words:

       NP                   V                  NP
   [ sem:MAX ]        [ sem:SEES ]        [ sem:BILL ]
        |                   |                   |
       Max                sees                Bill

b. Rule [b] allows V and NP to be grouped into a VP:

                                     VP
                           [ sem: [ pred:SEES ] ]
                           [      [ arg2:BILL ] ]
                            +---------+---------+
       NP                   V                  NP
   [ sem:MAX ]        [ sem:SEES ]        [ sem:BILL ]
        |                   |             [ case:acc ]
        |                   |                   |
       Max                sees                Bill
c. Rule [a] allows NP and VP to be grouped into an S:

                       S
            [ sem: [ pred:SEES ] ]
            [      [ arg1:MAX  ] ]
            [      [ arg2:BILL ] ]
        +--------------+--------------+
        |                            VP
        |                  [ sem: [ pred:SEES ] ]
        |                  [      [ arg2:BILL ] ]
        |                   +---------+---------+
       NP                   V                  NP
  [ sem:MAX  ]        [ sem:SEES ]        [ sem:BILL ]
  [ case:nom ]              |             [ case:acc ]
        |                   |                   |
       Max                sees                Bill
Figure 5. DAG representations of feature structures.

  [ a:b ]               .
  [ c:b ]     =       / | \
  [ e:d ]           a/ c|  \e
                     \  |   \
                      \ |    \
                        b     d

                                            .
  [ syn: [ case:acc    ] ]               /     \
  [      [ gender:masc ] ]    =       syn/       \sem
  [ sem:MAN              ]             /         \
                                      .          MAN
                                    /   \
                               case/     \gender
                                  /       \
                                acc       masc
Figure 6. Parse tree for Max sees me. The ungrammatical sentence
Me sees Max is ruled out by a feature conflict.

                       S
            [ sem: [ pred:SEES ] ]
            [      [ arg1:MAX  ] ]
            [      [ arg2:ME   ] ]
        +--------------+--------------+
        |                            VP
        |                  [ sem: [ pred:SEES ] ]
        |                  [      [ arg2:ME   ] ]
        |                   +---------+---------+
       NP                   V                  NP
  [ sem:MAX  ]        [ sem:SEES ]        [ sem:ME   ]
  [ case:nom ]              |             [ case:acc ]
        |                   |                   |
       Max                sees                 me
Figure 7.

% GULP example 1.
% Grammar from Figure 3, in DCG notation, with GULP feature structures.

s(sem: (pred:X .. arg1:Y .. arg2:Z)) --> np(sem:Y .. case:nom),
                                         vp(sem: (pred:X .. arg2:Z)).

vp(sem: (pred:X1 .. arg2:Y1)) --> v(sem:X1),
                                  np(sem:Y1).

v(sem:'SEES')             --> [sees].
np(sem:'MAX')             --> [max].
np(sem:'BILL')            --> [bill].
np(sem:'ME' .. case:acc)  --> [me].

% Procedure to parse a sentence and display its features

try(String) :- writeln([String]),
               phrase(s(Features),String),
               display_feature_structure(Features).

% Example sentences

test1 :- try([max,sees,bill]).
test2 :- try([max,sees,me]).
test3 :- try([me,sees,max]).   /* should fail */
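To see what the feature structures in Figure 7 accomplish without running GULP, the same grammar can be written in plain Prolog by giving every feature a fixed argument position by hand. The encoding below, fs(Sem,Case) for a category's features and sem(Pred,Arg1,Arg2) for a semantic structure, is chosen only for this illustration and is not claimed to be GULP's internal value-list format; the predicate parse is likewise hypothetical.

```prolog
% fs(Sem, Case)          stands in for  sem:Sem .. case:Case
% sem(Pred, Arg1, Arg2)  stands in for  pred:Pred .. arg1:Arg1 .. arg2:Arg2
% An anonymous variable marks a feature that the rule does not mention.

s(fs(sem(X, Y, Z), _)) --> np(fs(Y, nom)),
                           vp(fs(sem(X, _, Z), _)).

vp(fs(sem(X1, _, Y1), _)) --> v(fs(X1, _)),
                              np(fs(Y1, _)).

v(fs('SEES', _))  --> [sees].
np(fs('MAX', _))  --> [max].
np(fs('BILL', _)) --> [bill].
np(fs('ME', acc)) --> [me].

% parse(+Words, -Features): parse Words as a sentence.
parse(Words, Features) :- phrase(s(Features), Words).
```

Writing the unmentioned positions out by hand is exactly the bookkeeping that feature structure notation removes: adding one new feature here would mean editing every fs(...) term in the grammar.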
Figure 8.

% Same as GULP example 1, but written in a much more PATR-like style,
% treating the unifications as separate operations.

s(Sfeatures) --> np(NPfeatures), vp(VPfeatures),
        { Sfeatures  = sem: (pred:X .. arg1:Y .. arg2:Z),
          NPfeatures = sem:Y .. case:nom,
          VPfeatures = sem: (pred:X .. arg2:Z) }.

vp(VPfeatures) --> v(Vfeatures), np(NPfeatures),
        { VPfeatures = sem: (pred:X1 .. arg2:Y1),
          Vfeatures  = sem:X1,
          NPfeatures = sem:Y1 }.

v(Features)  --> [sees], { Features = sem:'SEES' }.
np(Features) --> [max],  { Features = sem:'MAX' }.
np(Features) --> [bill], { Features = sem:'BILL' }.
np(Features) --> [me],   { Features = sem:'ME' .. case:acc }.

% Procedure to parse a sentence and display its features

try(String) :- writeln([String]),
               s(Features,String,[]),
               display_feature_structure(Features).

% Example sentences

test1 :- try([max,sees,bill]).
test2 :- try([max,sees,me]).
test3 :- try([me,sees,max]).   /* should fail */
Figure 9.

% Demonstration of a hold stack that
% picks up the word 'what' at beginning of
% sentence, and carries it along until an
% empty NP position is found

% S may or may not begin with 'what did'.
% If it does, 'what' is added to the stack
% before the NP and VP are parsed.

s(S) --> np(NP), vp(VP),
        { S  = hold: (in:H1..out:H3),
          NP = hold: (in:H1..out:H2),
          VP = hold: (in:H2..out:H3) }.

s(S) --> [what,did], np(NP), vp(VP),
        { S  = hold: (in:H1..out:H3),
          NP = hold: (in:[what|H1]..out:H2),
          VP = hold: (in:H2..out:H3) }.

% NP is parsed by either accepting det and n,
% leaving the hold stack unchanged, or else
% by extracting 'what' from the stack without
% accepting anything from the input string.

np(NP) --> det, n, { NP = hold: (in:H..out:H) }.
np(NP) --> [],     { NP = hold: (in:[what|H1]..out:H1) }.

% VP consists of V followed by NP or S.
% Both hold:in and hold:out are the same
% on the VP as on the S or NP, since the
% hold stack can only be altered while
% processing the S or NP, not the verb.

vp(VP) --> v, np(NP), { VP = hold:H,
                        NP = hold:H }.
vp(VP) --> v, s(S),   { VP = hold:H,
                        S  = hold:H }.

% Lexicon

det --> [the];[a];[an].
n   --> [dog];[cat];[boy].
v   --> [said];[say];[chase];[chased].

try(X) :- writeln([X]),
          S = hold: (in:[]..out:[]),
          phrase(s(S),X,[]).

test1 :- try([the,boy,said,the,dog,chased,the,cat]).
test2 :- try([what,did,the,boy,say,chased,the,cat]).
test3 :- try([what,did,the,boy,say,the,cat,chased]).
test4 :- try([what,did,the,boy,say,the,dog,chased,the,cat]).
         /* test4 should fail */
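The hold-stack mechanism of Figure 9 does not depend on feature structures as such. As a sketch for readers without GULP, the same grammar can be written in plain Prolog by passing hold:in and hold:out as two ordinary arguments. The argument order (in before out) and the predicate name accepts are assumptions of this illustration.

```prolog
% Each s, np and vp carries two extra arguments: HoldIn and HoldOut.

s(H1, H3) --> np(H1, H2), vp(H2, H3).
s(H1, H3) --> [what, did], np([what|H1], H2), vp(H2, H3).

% An NP is either det + n (stack unchanged) or empty (pops 'what').
np(H, H)        --> det, n.
np([what|H], H) --> [].

% The verb itself never touches the stack.
vp(H1, H2) --> v, np(H1, H2).
vp(H1, H2) --> v, s(H1, H2).

det --> [the] ; [a] ; [an].
n   --> [dog] ; [cat] ; [boy].
v   --> [said] ; [say] ; [chase] ; [chased].

% accepts(+Words): Words is a sentence, with the hold stack empty
% both before and after it.
accepts(Words) :- phrase(s([], []), Words).
```

Requiring the stack to be empty at the end is what rejects the fourth test sentence of Figure 9: 'what' is pushed but no empty NP position is available to pop it.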
Figure 10.

% Discourse Representation Theory
% (part of the program from Johnson & Klein 1986,
% translated from PrAtt into GULP).

% unique_integer(N)
%   instantiates N to a different integer
%   every time it is called, thereby generating
%   unique indices.

unique_integer(N) :-
        retract(unique_aux(N)),
        !,
        NewN is N+1,
        asserta(unique_aux(NewN)).

unique_aux(0).

% Nouns
%   Each noun generates a unique index and inserts
%   it, along with a condition, into the DRS that
%   is passed to it.

n(N) --> [man],
        { unique_integer(C),
          N = syn:index:C ..
              sem: (in:  [Current|Super] ..
                    out: [[C,man(C)|Current]|Super]) }.

n(N) --> [donkey],
        { unique_integer(C),
          N = syn:index:C ..
              sem: (in:  [Current|Super] ..
                    out: [[C,donkey(C)|Current]|Super]) }.

% Verbs
%   Each verb is linked to the indices of its arguments
%   through syntactic features. Using these indices,
%   it adds the appropriate predicate to the semantics.

v(V) --> [saw],
        { V = syn: (arg1:Arg1 .. arg2:Arg2) ..
              sem: (in:  [Current|Super] ..
                    out: [[saw(Arg1,Arg2)|Current]|Super]) }.

% Determiners
%   Determiners tie together the semantics of their
%   scope and restrictor. The simplest determiner,
%   'a', simply passes semantic material to its
%   restrictor and then to its scope. A more complex
%   determiner such as 'every' passes an empty list
%   to its scope and restrictor, collects whatever
%   semantic material they add, and then arranges
%   it into an if-then structure.

det(Det) --> [a],
        { Det = sem:res:in:A,     Det = sem:in:A,
          Det = sem:scope:in:B,   Det = sem:res:out:B,
          Det = sem:out:C,        Det = sem:scope:out:C }.

det(Det) --> [every],
        { Det = sem:res:in:[[]|A],    Det = sem:in:A,
          Det = sem:scope:in:[[]|B],  Det = sem:res:out:B,
          Det = sem:scope:out:[Scope,Res|[Current|Super]],
          Det = sem:out:[[Res>Scope|Current]|Super] }.

% Noun phrase
%   Pass semantic material to the determiner, which
%   will specify the logical structure.

np(NP) --> { NP=sem:A,       Det=sem:A,
             Det=sem:res:B,  N=sem:B,
             NP=syn:C,       N=syn:C }, det(Det),n(N).

% Verb phrase
%   Pass semantic material to the embedded NP
%   (the direct object).

vp(VP) --> { VP = sem:A,         NP = sem:A,
             NP = sem:scope:B,   V = sem:B,
             VP = syn:arg2:C,    NP = syn:index:C,
             VP = syn:D,         V = syn:D }, v(V), np(NP).

% Sentence
%   Pass semantic material to the subject NP.
%   Pass VP semantics to the subject NP as its scope.

s(S) --> { S = sem:A,          NP = sem:A,
           S = syn:B,          VP = syn:B,
           NP = sem:scope:C,   VP = sem:C,
           VP = syn:arg1:D,    NP = syn:index:D }, np(NP), vp(VP).

% Procedure to parse and display a sentence

try(String) :- write(String),nl,
               Features = sem:in:[[]],     /* start w. empty structure */
               phrase(s(Features),String),
               Features = sem:out:SemOut,  /* extract what was built */
               display_feature_structure(SemOut).

% Example sentences

test1 :- try([a,man,saw,a,donkey]).
test2 :- try([a,donkey,saw,a,man]).
test3 :- try([every,man,saw,a,donkey]).
test4 :- try([every,man,saw,every,donkey]).
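The counter unique_integer in Figure 10 works by retracting a stored fact and asserting its successor. In Prolog systems that follow the ISO standard, a predicate that is modified at run time generally has to be declared dynamic first. The variant below is a sketch under that assumption; the names next_index and index_counter are hypothetical and are not part of GULP or of the Figure 10 program.

```prolog
:- dynamic index_counter/1.

% The stored fact holds the next index to hand out.
index_counter(0).

% next_index(-N): return the stored value and replace it with N+1.
next_index(N) :-
    retract(index_counter(N)),
    !,
    Next is N + 1,
    asserta(index_counter(Next)).
```

Because the counter lives in the database rather than in a variable, its value survives backtracking, so an index handed out on a failed parse path is never reused.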
Figure 11.

% BUP in GULP:
% Bottom-up parsing algorithm of Matsumoto et al. (1986)
% with the grammar from Figure 3.

% Goal-forming clause

goal(G,Gf,S1,S3) :-
        word(W,Wf,S1,S2),
        NewGoal =.. [W,G,Wf,Gf,S2,S3],
        call(NewGoal).

% Terminal clauses for nonterminal symbols

s(s,F,F,X,X).
vp(vp,F,F,X,X).
np(np,F,F,X,X).

% Phrase-structure rules

% np vp --> s

np(G,NPf,Gf,S1,S3) :- goal(vp,VPf,S1,S2),
                      s(G,Sf,Gf,S2,S3),
                      NPf = sem:Y..case:nom,
                      VPf = sem: (pred:X..arg2:Z),
                      Sf  = sem: (pred:X..arg1:Y..arg2:Z).

% v np --> vp

v(G,Vf,Gf,S1,S3) :- goal(np,NPf,S1,S2),
                    vp(G,VPf,Gf,S2,S3),
                    Vf  = sem:X1,
                    NPf = sem:Y1..case:acc,
                    VPf = sem: (pred:X1..arg2:Y1).

% Terminal symbols

word(v,sem:'SEES',[sees|X],X).
word(np,sem:'MAX',[max|X],X).
word(np,sem:'BILL',[bill|X],X).
word(np,sem:'ME'..case:acc,[me|X],X).

% Procedure to parse a sentence and display its features

try(String) :- writeln([String]),
               goal(s,Features,String,[]),
               display_feature_structure(Features).

% Example sentences

test1 :- try([max,sees,bill]).
test2 :- try([max,sees,me]).
test3 :- try([me,sees,max]).   /* should fail */
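The control structure of the BUP parser in Figure 11 can also be exercised in plain Prolog. The sketch below keeps the clauses of Figure 11 but replaces each feature structure with a hand-built term in which every feature has a fixed position: fs(Sem,Case) for a category and sem(Pred,Arg1,Arg2) for a semantic structure. This encoding and the predicate bup_parse are assumptions made for the illustration, not GULP's own representation.

```prolog
% Goal-forming clause: pick up a word of category W, then call the
% clause for W to try to grow it into the goal category G.
goal(G, Gf, S1, S3) :-
    word(W, Wf, S1, S2),
    NewGoal =.. [W, G, Wf, Gf, S2, S3],
    call(NewGoal).

% Terminal clauses: the category found is already the goal.
s(s, F, F, X, X).
vp(vp, F, F, X, X).
np(np, F, F, X, X).

% np vp --> s
np(G, fs(Y, nom), Gf, S1, S3) :-
    goal(vp, fs(sem(X, _, Z), _), S1, S2),
    s(G, fs(sem(X, Y, Z), _), Gf, S2, S3).

% v np --> vp
v(G, fs(X1, _), Gf, S1, S3) :-
    goal(np, fs(Y1, acc), S1, S2),
    vp(G, fs(sem(X1, _, Y1), _), Gf, S2, S3).

% Terminal symbols
word(v,  fs('SEES', _), [sees|X], X).
word(np, fs('MAX', _),  [max|X],  X).
word(np, fs('BILL', _), [bill|X], X).
word(np, fs('ME', acc), [me|X],   X).

% bup_parse(+Words, -Features): parse Words bottom-up as a sentence.
bup_parse(Words, Features) :- goal(s, Features, Words, []).
```

As in Figure 11, the input me sees max is rejected because the subject position requires case nom while the word me supplies acc.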