


                              Research Report AI-1989-01

                           Artificial Intelligence Programs

                              The University of Georgia

                                Athens, Georgia 30602


                                         Available by ftp from

                                           aisun1.ai.uga.edu

                                            (128.192.12.9)


                                            Series editor: 

                                           Michael Covington

                                      mcovingt@aisun1.ai.uga.edu


                                    GULP 2.0: An Extension of

                         Prolog for Unification-Based Grammar


                                 Michael A. Covington


                            Advanced Computational Methods Center

                                         University of Georgia

                                         Athens, Georgia 30602


                                             January 1989


                        ABSTRACT:  A  simple extension  to  Prolog facilitates

                        implementation of unification-based grammars (UBGs) by

                        adding a new notational device, the feature structure,

                        whose   behavior   emulates  graph   unification.  For

                        example,  a:b..c:d denotes  a  feature  structure in

                        which a has the value  b, c has the value d,  and the

                        values  of  all  other  features  are  unspecified.  A

                        modified   Prolog   interpreter   translates   feature

                        structures into Prolog terms that unify in the desired

                        way.   Thus,  the   extension  is   purely  syntactic,

                        analogous to  the automatic translation  of "abc"  to

                        [97,98,99] in Edinburgh Prolog. 

                              The  extended language  is known as  GULP (Graph

                        Unification Logic Programming); it is  as powerful and

                        concise as PATR-II (Shieber 1986a,b) and other grammar

                        development tools, while retaining all the versatility

                        of Prolog. GULP can be used with grammar rule notation

                        (DCGs) or  any other parser that  the programmer cares

                        to implement. 

                              Besides its uses in natural language processing,

                        GULP provides a way to supply keyword arguments to any

                        procedure.


                  1. Introduction


                        A  number   of  software  tools  have   been  developed  for

                  implementing  unification-based  grammars,   among  them   PATR-II

                  (Shieber  1986a,b), D-PATR (Karttunen  1986a), PrAtt  (Johnson and

                  Klein  1986),  and AVAG  (Sedogbo  1986). This  paper  describes a

                  simple  extension to  the syntax  of Prolog  that serves  the same

                  purpose while making a  much less radical change to  the language.

                  Unlike  PATR-II and  similar systems,  this system  treats feature

                  structures as  first-class objects that appear in any context, not

                  just in  equations.1 Further, feature  structures can  be used not

                                    
                       1 The first version of GULP (Covington 1987) was

               developed with support from National Science Foundation

               Grant IST-85-02477. I want to thank Franz Guenthner,

               Rainer Bäuerle, and the other researchers at the




                  only  in natural  language processing,  but also  to pass  keyword

                  arguments to any procedure.


                        The  extension is  known  as GULP  (Graph Unification  Logic

                  Programming).  It allows  the  programmer to  write  a:b..c:d to

                  stand for a feature structure in which feature a has  the value b,

                  feature  c   has  the  value  d,   and  all   other  features  are

                  uninstantiated.2  The  interpreter  translates  feature structures

                  written  in this notation into ordinary Prolog terms that unify in

                  the  desired way.  Thus, this  extension is  similar in  spirit to

                  syntactic devices already in the language,  such as writing "abc"

                  for [97,98,99] or writing [a,b,c] for .(a,.(b,.(c,nil))).
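
                  The two shorthands can be illustrated outside Prolog. The
                  following Python sketch is only an analogy: parse_flat is a
                  hypothetical helper, it handles flat structures only, and it
                  says nothing about how the GULP translator itself works.

```python
def parse_flat(text):
    """Read flat a:b..c:d notation into a dict; nested values are not handled."""
    return dict(pair.split(":", 1) for pair in text.split(".."))

# The "abc" shorthand spelled out as a list of character codes.
char_codes = [ord(ch) for ch in "abc"]

# The a:b..c:d shorthand spelled out as feature/value pairs.
structure = parse_flat("a:b..c:d")

print(char_codes)
print(structure)
```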


                        GULP can be used with grammar rule notation (definite clause

                  grammars,  DCGs) or with any  parser that the  programmer cares to

                  implement in Prolog.


                  2. What is unification-based grammar?


                  2.1. Unification-based theories


                        Unification-based  grammar (UBG)  comprises all  theories of

                  grammar in which unification (merging) of feature structures plays

                  a prominent  role. As such,  UBG is  not a theory  of grammar  but

                  rather  a formalism in which theories of grammar can be expressed.

                  Such  theories include  functional  unification grammar,  lexical-

                  functional grammar  (Kaplan and Bresnan  1982), generalized phrase

                  structure  grammar  (Gazdar  et   al.  1986),  head-driven  phrase

                  structure grammar (Pollard and Sag 1987), and others.


                        UBGs use context-free grammar rules in which the nonterminal

                  symbols  are accompanied  by  sets of  features.  The addition  of

                  features  increases the  power of  the grammar  so that  it  is no

                  longer context-free; indeed, in the worst case, parsing with  such

                  a grammar can be NP-complete (Barton, Berwick, and Ristad 1987:93-

                  96).


                        However,  in  practice,  these intractable  cases  are rare.

                  Theorists  restrain their use of features so that the grammars, if

                  not  actually  context-free, are  close  to  it, and  context-free


               Seminar für natürlich-sprachliche Systeme, University

               of Tübingen, for their hospitality and for helpful

               discussions. The opinions and conclusions expressed

               here are solely those of the author.


                    2 This use of the colon makes the Quintus Prolog

               and Arity Prolog module systems unavailable; so far,

               this has not caused problems.




                  parsing techniques are successful  and efficient. Joshi (1986) has

                  described this class of grammars as "mildly context-sensitive."


                  2.2. Grammatical features


                        Grammarians have observed since ancient times that each word

                  in a sentence has a set of attributes, or features, that determine

                  its function and restrict its usage. Thus:


                         The                     dog                 barks.


               category:determiner      category:noun         category:verb

                                        number:singular       number:singular

                                                              person:3rd

                                                              tense:present


                  The  earliest generative  grammars  of Chomsky  (1957) and  others

                  ignored  all   of  these  features  except   category,  generating

                  sentences with context-free phrase-structure rules such as


                        sentence --> noun phrase + verb phrase


                        noun phrase --> determiner + noun


                  plus transformational rules  that rearranged syntactic  structure.

                  Syntactic structure  was described  by tree diagrams  (Figure 1).3

                  Number  and tense markers were treated as separate elements of the

                  string (e.g.,  boys = boy +  s). "Subcategorization" distinctions,

                  such as the  fact that some verbs take objects  and other verbs do

                  not,  were handled by splitting  a single category,  such as verb,

                  into two (verb_transitive and verb_intransitive).


                        But  complex, cross-cutting combinations  of features cannot

                  be handled in  this way,  and Chomsky  (1965) eventually  attached

                  feature bundles  to all  the nodes  in the  tree  (Figure 2).  His

                  contemporaries  accounted  for  grammatical  agreement  (e.g., the

                  agreement of the number features of subject  and verb) by means of

                  transformations  that copied  features from  one node  to another.

                  This remained  the standard  account of grammatical  agreement for

                  many years.


                        But   feature  copying   is  unnecessarily   procedural.  It

                  presumes,  unjustifiably, that  whenever two  nodes agree,  one of

                  them is  the source and the  other is the destination  of a copied

                  feature. In  practice,  the source  and  destination are  hard  to

                                    
                       3 Figures are printed at the end of this document.




                  distinguish. Do  singular subjects  require singular verbs,  or do

                  singular verbs  require singular subjects? This  is an empirically

                  meaningless question. Moreover,  when agreement processes interact

                  to   combine  features  from  a  number  of  nodes,  the  need  to

                  distinguish   source   from  destination   introduces  unnecessary

                  clumsiness. 


                  2.3. Unification-based grammar


                        Unification-based  grammar attacks  the  same  problem  non-

                  procedurally,  by  stating  constraints  on  feature  values.  For

                  example, the rule


                  [2.3a]    PP        -->       P         NP

                                                   [case:acc]


                  says  that in  a  prepositional  phrase, the  NP  must  be in  the

                  accusative case. 


                        More precisely, rule [2.3a] says the feature structure


                              [case:acc]


                  must  be unified (merged)  with whatever  features the  NP already

                  has. If the NP already has case:acc,  all is well. If the NP has

                  no value  for case,  it acquires  case:acc.  But if  the NP  has

                  case with  some value other than  acc, the  unification fails and

                  the rule cannot apply.
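
                  The three cases just described can be sketched in Python,
                  assuming that a flat feature structure is represented as a
                  dictionary of atomic values. The name unify_flat and the
                  feature number:sg are illustrative assumptions, not part of
                  any system described here.

```python
def unify_flat(a, b):
    """Merge two flat feature structures (dicts of atomic values).

    Returns the merged dict, or None if some feature has conflicting values.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None              # clash, e.g. case:nom against case:acc
        result[feature] = value
    return result

constraint = {"case": "acc"}         # what rule [2.3a] demands of the NP

print(unify_flat({"case": "acc"}, constraint))   # NP already has case:acc
print(unify_flat({"number": "sg"}, constraint))  # NP acquires case:acc
print(unify_flat({"case": "nom"}, constraint))   # unification fails
```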


                        Agreement is handled with variables, as in the rule 


                  [2.3b]    S  -->       NP              VP

                                   [ person:X ]    [ person:X ]
                                   [ number:Y ]    [ number:Y ]


                  which requires the NP and  VP to agree in person and  number. Here

                  X  and  Y   are  variables;  person:X  merges  with the  person

                  feature of both the NP and  the VP, thereby ensuring that the same

                  value  is  present in  both places.  The  same thing  happens with

                  number.


                        Strictly speaking, the  category label (S, NP,  VP, etc.) is

                  part of the feature structure. Thus, 


                            NP

                    [case:acc]


                  is short for:




                    [ category:NP ]
                    [ case:acc    ]


                  In practice, however,  the category label usually  plays a primary

                  role in parsing, and it is convenient to give it a special status.


                        Grammar  rules  can alternatively  be  written  in terms  of

                  equations  that the  feature  values must  satisfy. In  equational

                  notation, rules [2.3a] and [2.3b] become:


                  [2.3c]    PP --> P NP    NP case = acc


                  [2.3d]    S --> NP VP    NP person = VP person
                                           NP number = VP number


                  or even, if the category label is to be treated as a feature,


                  [2.3e]    X --> Y Z      X category = PP
                                           Y category = P
                                           Z category = NP
                                           Z case = acc


                  [2.3f]    X --> Y Z      X category = S
                                           Y category = NP
                                           Z category = VP
                                           Y person = Z person
                                           Y number = Z number


                  where X, Y, and Z are  variables. Equations are used  in PATR-II,

                  PrAtt,  and  other implementation  tools,  but not  in  the system

                  described here.


                        The  value of a feature  can itself be  a feature structure.

                  This  makes it  possible  to group  features  together to  express

                  generalizations.   For  instance,  one  can  group  syntactic  and

                  semantic features together, creating structures such as:


                    [ syn: [ case:acc    ]   ]
                    [      [ gender:masc ]   ]
                    [                        ]
                    [ sem: [ pred:MAN      ] ]
                    [      [ countable:yes ] ]
                    [      [ animate:yes   ] ]


                  Then a rule can copy the  syntactic or semantic features en  masse

                  to another node, without enumerating them. 




                  2.4. A sample grammar


                        Features provide a powerful way to pass information from one

                  place  to another  in a  grammatical  description. The  grammar in

                  Figure 3  is an example. It  uses features not only  to ensure the

                  grammaticality of  the sentences  generated, but  also to  build a

                  representation of  the meaning of the  sentence. Every constituent

                  has a sem feature representing its meaning. The rules combine the

                  meanings   of  the   individual   words  into   predicate-argument

                  structures representing  the meanings of all  of the constituents.

                  The meaning of the sentence is represented by the sem feature  of

                  the topmost S node.


                        Like  all the examples given  here, this grammar is intended

                  only as a demonstration of the power of unification-based grammar,

                  not  as a viable  linguistic analysis.  Thus, for  simplicity, the

                  internal structure of  the NP is ignored and the proposal to group

                  syntactic features together is abandoned.


                        To  see how the grammar works, consider how the sentence Max

                  sees  Bill  would be  parsed bottom-up.  The  process is  shown in

                  Figure 4. First rules [c], [d], and [e] supply the features of the

                  individual words  (Figure 4a). Next the  bottom-up parser attempts

                  to build constituents. 


                        By rule  [b], sees and Bill constitute  a VP (Figure 4b). At

                  this step,  construction of a semantic  representation begins. The

                  sem  feature of the VP has as its value another feature structure

                  which contains two features: pred, the semantics of the verb, and

                  arg2, the semantics of the direct object. 


                        Rule [b] also assigns  the feature case:acc to Bill;  this

                  has no effect on the form of the noun  but would be important if a

                  pronoun had been used instead. 


                        Finally,  rule  [a] allows  the resulting  NP  and VP  to be

                  grouped  together  into  an  S  (Figure  4c).  This  rule  assigns

                  nominative case to Max and combines the semantics of the NP and VP

                  to  construct the sem  feature of the S  node, thereby accounting

                  for the meaning of the complete sentence.


                        It  would be  equally  feasible to  parse top-down.  Parsing

                  would  then begin with an  S node, expanded  to NP and  VP by rule

                  [a].  The NP  would then  expand to  Max  using rule  [d], thereby

                  supplying a value for sem of NP, and  hence also for sem:arg1 of

                  S.  Similarly, expansion  of the  VP would  supply values  for the

                  remaining features of S.


                        Crucially, it is possible (and necessary) to match variables

                  with each other before giving them values. In a top-down parse, we

                  know that  sem:arg2 of S will  have the same value  as sem:arg2

                  of VP long before we know what this value is to be.




                  2.5. Functions, paths, re-entrancy, and graphs


                        A feature can be viewed as a partial function which, given a

                  feature structure, may  or may  not yield a  value. For  instance,

                  given the structure


                    [ syn: [ case:acc    ] ]
                    [      [ gender:masc ] ]
                    [ sem: MAN             ]


                  the feature  sem yields the  value MAN, the  feature syn  yields

                  another feature structure, and the feature tense  yields no value

                  (it is a case in which the partial function is undefined).


                        A path is a series of features that pick out an element of a

                  nested feature structure. Formally,  a path is the  composition of

                  the functions just mentioned. For example, the  path syn:case is

                  what you  get by applying the  function case to the  value of the

                  function syn;  applied to the  structure above, syn:case  yields

                  the value acc. Path notation  provides a way to refer to a single

                  feature  deep  in  a   nested  structure  without  writing  nested

                  brackets. Thus one can write rules such as


                        PP  -->   P       NP
                                    [syn:case:acc]


                  or, in equational form,


                        PP --> P NP         NP syn case = acc
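
                  Viewing features as partial functions and paths as their
                  composition can be sketched as follows, assuming nested
                  dictionaries stand in for feature structures. The helper
                  follow_path is hypothetical, and None is used to mark the
                  case in which the partial function is undefined.

```python
def follow_path(structure, path):
    """Apply the features named in `path` (e.g. "syn:case") one after another.

    Returns None wherever one of the partial functions is undefined.
    """
    value = structure
    for feature in path.split(":"):
        if not isinstance(value, dict) or feature not in value:
            return None
        value = value[feature]
    return value

man = {"syn": {"case": "acc", "gender": "masc"}, "sem": "MAN"}

print(follow_path(man, "sem"))        # an atomic value
print(follow_path(man, "syn"))        # another feature structure
print(follow_path(man, "syn:case"))   # a value reached through a path
print(follow_path(man, "tense"))      # undefined for this structure
```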


                        Feature structures are re-entrant.  This means that features

                  are like pointers; if two  of them have the same value,  then they

                  point  to the  same object,  not to two  similar objects.  If this

                  object  is  subsequently  modified  (e.g.,  by  giving  values  to

                  variables),  the  change will  show up  in  both places.  Thus the

                  structure     

                                                 
                    [ a:b ]
                    [ c:b ]
                    [ e:d ]


                  is more accurately represented as something like:




                    [ a ---+--- b ]
                    [ c ---+      ]
                    [ e ------- d ]


                  There is only one b, and a and c both point to it.
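
                  The pointer analogy can be made concrete with a small
                  Python sketch. The class Node is an assumption introduced
                  here to model a value cell; it contrasts a re-entrant
                  structure with one whose values merely look alike.

```python
class Node:
    """A mutable value cell; two features that share a Node are re-entrant."""
    def __init__(self, value=None):
        self.value = value

shared = Node()                                   # value not yet known
reentrant = {"a": shared, "c": shared, "e": Node("d")}
lookalike = {"a": Node(), "c": Node(), "e": Node("d")}

shared.value = "b"            # instantiate the shared value once
lookalike["a"].value = "b"    # instantiate a only

print(reentrant["c"].value)   # the change shows up under c as well
print(lookalike["c"].value)   # c is a separate object and is unaffected
```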


                        Re-entrant  feature structures can be formalized as directed

                  acyclic graphs (DAGs) as shown in Figure 5. Features are arcs  and

                  feature values  are the vertices or subgraphs found at the ends of

                  the arcs. A path is a series of arcs chained together.


                        Re-entrancy  follows  from the  way  variables  behave in  a

                  grammar. All occurrences  of the  same variable take  on the  same

                  value at the  same time.  (As in Prolog,  like-named variables  in

                  separate  rules are not considered  to be the  same variable.) The

                  value may  itself contain a variable  that will later get  a value

                  from somewhere  else. This is  why bottom-up and  top-down parsing

                  work equally well.


                  2.6. Unification


                        Our  sample  grammar  relies  on the  merging  of  partially

                  specified feature  structures. Thus,  the subject of  the sentence

                  gets case from  rule [a] and semantics from rule  [d] or [e]. This

                  merging  can be  formalized as  unification.   The unifier  of two

                  feature structures A  and B  is the smallest  feature structure  C

                  that contains all the information in both A and B. 


                        Feature  structure  unification   is  equivalent  to   graph

                  unification, i.e., merging of  directed acyclic graphs, as defined

                  in  graph theory. The unifier of two  graphs is the smallest graph

                  that contains all the nodes and  arcs in the graphs being unified.

                  This is  similar but  not identical  to  Prolog term  unification;

                  crucially, elements of the structure  are identified only by name,

                  not (as in Prolog) by position.


                        Formally,  the unification  of  feature structures  A and  B

                  (giving C) is defined as follows:


                        (1) Any feature that occurs in A  but not B, or in B but not

                        A, also occurs in C with the same value.


                        (2) Any feature that occurs  in both A and B also  occurs in

                        C, and its value in C is the unifier  of its values in A and

                        B. 


                  Feature values, in turn, are unified as follows:




                        (a) If both values are atomic symbols, they must be the same

                        atomic symbol,  or else  the unification fails  (the unifier

                        does not exist).


                        (b)  A variable  unifies with  any object  by  becoming that

                        object.   All  occurrences   of  that   variable  henceforth

                        represent the  object with  which the variable  has unified.

                        Two  variables can unify with each other, in which case they

                        become the same variable.


                        (c) If both  values are  feature structures,  they unify  by

                        applying this process recursively.


                  Thus


                    [ a:b ]   and   [ c:d ]
                    [ c:d ]         [ e:f ]

                                                                                 
                  unify giving:

                                  
                    [ a:b ]
                    [ c:d ]
                    [ e:f ]


                  Likewise, [a:X] and  [a:b] unify,  instantiating X to  the value

                  b; and

                             
                    [ a:X ]   and   [ a:c ]
                    [ b:c ]         [ b:Y ]

                                                      
                  unify by instantiating both X and Y to c. 
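
                  Clauses (1), (2), and (a) through (c) above can be sketched
                  in Python. This is a minimal sketch under stated
                  assumptions, not the GULP implementation: dictionaries stand
                  for feature structures, strings for atomic symbols, and the
                  class Var (introduced here) for variables. Bindings made
                  before a failure are not undone, so the sketch does not
                  support backtracking.

```python
class Var:
    """A logic variable; `ref` stays None until the variable is bound."""
    def __init__(self, name):
        self.name = name
        self.ref = None

class UnificationFailure(Exception):
    """Raised when no unifier exists."""

def deref(x):
    """Follow variable bindings to the current value."""
    while isinstance(x, Var) and x.ref is not None:
        x = x.ref
    return x

def unify(a, b):
    """Return the unifier of a and b, binding variables as a side effect."""
    a, b = deref(a), deref(b)
    if a is b:
        return a
    if isinstance(a, Var):              # clause (b): variable becomes the object
        a.ref = b
        return b
    if isinstance(b, Var):
        b.ref = a
        return a
    if isinstance(a, dict) and isinstance(b, dict):    # clause (c)
        result = dict(a)                # clause (1): features found in only one
        for feature, value in b.items():
            if feature in result:       # clause (2): features found in both
                result[feature] = unify(result[feature], value)
            else:
                result[feature] = value
        return result
    if a == b:                          # clause (a): identical atomic symbols
        return a
    raise UnificationFailure(f"{a!r} does not unify with {b!r}")

# The three examples from the text.
first = unify({"a": "b", "c": "d"}, {"c": "d", "e": "f"})

X = Var("X")
second = unify({"a": X}, {"a": "b"})

X2, Y2 = Var("X"), Var("Y")
third = unify({"a": X2, "b": "c"}, {"a": "c", "b": Y2})

print(first, second, third)
print(deref(X), deref(X2), deref(Y2))
```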


                        As  in   Prolog,   unification  is   not  always   possible.

                  Specifically, if A and B have different (non-unifiable) values for

                  some feature,  unification fails.  A grammar  rule requiring  A to

                  unify with B cannot apply if A and B are not unifiable. 


                        Unification-based grammars rely on failure of unification to

                  rule out  ungrammatical sentences. Consider, for  example, why our

                  sample  grammar generates Max sees me but  not Me sees Max. In Max

                  sees  me, both  rule [b]  and  rule [f]  specify that  me has  the

                  feature case:acc, giving the structure shown in Figure 6. 


                        However, in Me  sees Max, the case of me  raises a conflict.

                  Rule  [a] specifies  case:nom and  rule [f]  specifies case:acc.




                  These values  are not  unifiable; hence  the specified  merging of

                  feature  structures  cannot go  through, and  the sentence  is not

                  generated by the grammar.

                                                                         
                  2.7. Declarativeness


                        Unification-based grammars are declarative,  not procedural.

                  That is,  they are  statements of well-formedness  conditions, not

                  procedures for generating  or parsing sentences. That  is why, for

                  example, sentences generated by  our sample grammar can  be parsed

                  either bottom-up or top-down.


                        This declarativeness comes from the fact that unification is

                  an order-independent operation. The unifier of A, B, and C  is the

                  same regardless of  the order  in which the  three structures  are

                  combined. This is true  of both graph unification and  Prolog term

                  unification.
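
                  Order-independence can be checked mechanically for a small
                  case. The sketch below is restricted to flat structures
                  with atomic values and no variables; unify_flat and the
                  sample structures are assumptions made for illustration. It
                  combines three structures in every possible order and
                  compares the results.

```python
from functools import reduce
from itertools import permutations

def unify_flat(a, b):
    """Merge two flat feature structures; None stands for failure."""
    if a is None or b is None:
        return None                      # failure propagates
    result = dict(a)
    for feature, value in b.items():
        if result.get(feature, value) != value:
            return None                  # conflicting atomic values
        result[feature] = value
    return result

A = {"case": "nom"}
B = {"number": "sg"}
C = {"person": "3rd", "number": "sg"}
D = {"case": "acc"}                      # conflicts with A

compatible = [reduce(unify_flat, order) for order in permutations([A, B, C])]
conflicting = [reduce(unify_flat, order) for order in permutations([A, B, D])]

print(all(r == compatible[0] for r in compatible))   # same unifier every time
print(all(r is None for r in conflicting))           # failure in every order
```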


                        The declarative nature  of UBGs is  subject to two  caveats.

                  First,  although  unification  is   order-independent,  particular

                  parsing algorithms are not. Recall that grammar rules of the form


                              A --> A B


                  cannot  be parsed  top-down, because they  lead to  infinite loops

                  ("To  parse an A, parse an A and then..."). Now consider a rule of

                  the form


                                A     -->     A      B
                              [f:X]         [f:Y]


                  If X  and Y  have different  values, then  top-down parsing  works

                  fine; if either X or Y does not have  a value at the time the rule

                  is invoked, top-down parsing will lead  to a loop. This shows that

                  one cannot simply give an arbitrary UBG to an arbitrary parser and

                  expect useful results; the order of instantiation must be kept in
                  mind.


                        Second,  many   common  Prolog  operations  are  not  order-

                  independent,  and this  must be  recognized in  any implementation

                  that allows  Prolog  goals  to be  inserted  into  grammar  rules.

                  Obviously,  the  cut  (!)  interferes  with order-independence  by

                  blocking  alternatives   that   would  otherwise   succeed.   More

                  commonplace  predicates such  as write, is,  and ==  lack order-

                  independence because they behave differently  depending on whether

                  their  arguments  are  instantiated  at  the  time  of  execution.

                  Colmerauer's Prolog II  (Giannesini et  al. 1986)  avoids some  of

                  these difficulties  by allowing  the programmer to  postpone tests

                  until a variable becomes instantiated, whenever that may be.


                  2.8. Building structures and moving data




                        Declarative unification-based  rules do more than  just pass

                  information up and down the tree. They can build structure as they

                  go. For example, the rule


                            VP            -->      V            NP

                    [ sem: [ pred:X ] ]        [ sem:X ]    [ sem:Y ]
                    [      [ arg:Y  ] ]


                  builds on the VP node a pred-arg structure that is absent on the V

                  and NP.
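
                  Read bottom-up, the rule amounts to the following Python
                  sketch. The function vp_rule and the values SEE and BILL
                  are assumptions chosen for illustration. Unlike the
                  declarative rule, a function fixes one direction of
                  computation, from the daughters to the mother.

```python
def vp_rule(v, np):
    """VP --> V NP: build the VP's sem from the sem values of V and NP."""
    return {"sem": {"pred": v["sem"], "arg": np["sem"]}}

verb = {"sem": "SEE"}
noun_phrase = {"sem": "BILL"}

vp = vp_rule(verb, noun_phrase)
print(vp)    # the pred-arg structure exists only on the VP
```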

                     
                        Unification can pass information  around in directions other

                  than  along  the  lines of  the  tree  diagram.  This is  done  by

                  splitting a feature into  two sub-features, one for input  and the

                  other  for output.  The  inputs and  outputs  can then  be  strung

                  together in any manner.


                        Consider for example the rule:


                        S             -->          NP                      VP

               [ sem: [ in:X1  ] ]        [ sem: [ in:X1  ] ]     [ sem: [ in:X2  ] ]
               [      [ out:X3 ] ]        [      [ out:X2 ] ]     [      [ out:X3 ] ]


                  This  rule assumes  that  sem of  the  S has  some  initial value

                  (perhaps an empty list) which is  passed into X1 from outside. X1

                  is then  passed to the NP,  which modifies it in  some way, giving

                  X2, which  is  passed to  the  VP  for further  modification.  The

                  output of the VP is X3, which becomes the output of the S.
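

                        The same threading can be imitated in plain Prolog, without

                  GULP, by giving each nonterminal explicit input and output

                  arguments. The following is only a sketch: the nonterminals

                  s//2, np//2 and vp//2, the toy lexicon, and the way each

                  constituent "modifies" the value (by adding a term to the front

                  of a list) are assumptions made for illustration.

```prolog
% Plain-Prolog DCG sketch of in/out threading (no GULP required).
% s//2, np//2, vp//2 and the lexicon are hypothetical names.

% The S rule only wires the values together: X1 -> NP -> X2 -> VP -> X3.
s(X1, X3) --> np(X1, X2), vp(X2, X3).

% Each constituent "modifies" the value by adding one term to the front.
np(In, [np(Word)|In]) --> [Word], { noun(Word) }.
vp(In, [v(Word)|In])  --> [Word], { verb(Word) }.

noun(max).
noun(bill).
verb(sleeps).
```

                  The query ?- phrase(s([],Out),[max,sleeps]). binds Out to

                  [v(sleeps),np(max)]. Because nothing in the rule depends on the

                  direction of data flow, supplying the output instead, as in

                  ?- phrase(s(In,[v(sleeps),np(max)]),[max,sleeps]). recovers

                  In = [].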


                        Such a rule is still declarative and can work either forward

                  or backward; that  is, parsing  can still take  place top-down  or

                  bottom-up.  Further, any node in the tree can communicate with any

                  other  node via  a string  of input and  output features,  some of

                  which  simply pass  information  along unchanged.  The example  in

                  section 4.2 below uses input and output features to undo unbounded

                  movements of words. Johnson and Klein (1985, 1986) use in and out

                  features to  perform complex manipulations  of semantic structure;

                  see section 4.3  (below) for a GULP reconstruction  of part of one

                  of their programs.


                  3. The GULP translator


                  3.1. Feature structures in GULP


                        The  key  idea of  GULP is  that  feature structures  can be

                  included in Prolog programs as ordinary  data items. For instance,

                  the feature structure


                         [ a:b ]

                         [ c:d ]


                  is written:


                              a:b..c:d


                  and GULP  translates a:b..c:d  into an  internal  representation

                  (called a value  list) in which the  a position is occupied by  b,

                  the c position is occupied by  d, and all other positions, if any,

                  are uninstantiated.


                        This  is analogous  to  the way  ordinary Prolog  translates

                  strings such as "abc" into lists of ASCII codes.     The     GULP

                  programmer always uses feature  structure notation and never deals

                  directly  with   value  lists.   Feature  structures   are  order-

                  independent; the translations  of a:b..c:d and of  c:d..a:b are

                  the same.


                        Nesting and paths are permitted. Thus, the structure


                         [ a:b        ]

                         [ c: [ d:e ] ]

                         [    [ f:g ] ]


                  is  written  a:b..c:(d:e..f:g).[4]  The  same  structure  can  be

                  written as


                         [ a:b   ]

                         [ c:d:e ]

                         [ c:f:g ]


                  which GULP renders as a:b..c:d:e..c:f:g.


                        GULP feature structures  are data items --  complex terms --

                  not  statements  or operations.  They  are most  commonly  used as

                  arguments. Thus, the rule

                                    
                       [4] Arity Prolog 4.0 requires a space before the '('.


                              S          -->        NP                VP

                        [ person:X ]          [ person:X ]      [ person:X ]

                        [ number:Y ]          [ number:Y ]      [ number:Y ]


                  can be written in DCG notation, using GULP, as:


                        s(person:X..number:Y) -->

                              np(person:X..number:Y),

                              vp(person:X..number:Y).


                  They  can also  be processed  by  ordinary Prolog  predicates. For

                  example, the predicate


                        nonplural(number:X) :- nonvar(X), X \= plural.


                  succeeds if and only if its  argument is a feature structure whose

                  number feature is instantiated to some value other than plural.


                        Any  feature  structure  unifies  with  any  other   feature

                  structure  unless  prevented  by  conflicting  values.  Thus,  the

                  internal  representations   of  a:b..c:d  and  c:d..e:f  unify,

                  giving a:b..c:d..e:f. But a:b  does not unify with a:d because

                  b and d do not unify with each other.


                  3.2. GULP syntax


                        Formally,  GULP adds to Prolog  the operators  `:' and `..'

                  and a wide range of built-in predicates. The operator `:'  joins a

                  feature  to  its  value,  which  itself  can  be  another  feature

                  structure. Thus in c:d:e, the value of c is d:e.


                        A  feature-value  pair  is  the  simplest  kind  of  feature

                  structure.  The  operator `..'  combines  feature-value  pairs  to

                  build more  complex feature  structures.[5] This is done by simply

                  unifying  them.  For  example,  the   internal  representation  of

                  a:b..c:d is  built by  unifying the  internal representations of

                  a:b and c:d.


                        This  fact can  be  exploited to  write "improperly  nested"

                  feature structures. For example,


                              a:b..c:X..c:d:Y..Z


                  denotes a feature structure in which:


                       [5] For compatibility with earlier versions, `..' can also be written `::'.


                              the value of a is b,


                              the value of c unifies with X,


                              the value of c also unifies with d:Y, and


                              the whole structure unifies with Z. 


                        Both operators,  `:' and `..', are  right-associative; that

                  is,  a:b:c   =   a:(b:c)  and  A..B..C =  A..(B..C). Arity

                  Prolog 4.0 requires an intervening space when `:' or `..'  occurs

                  adjacent  to   a  left   parenthesis;  other  Prologs   lack  this

                  restriction.


                  3.3. Built-in predicates


                        GULP 2.0  is an ordinary Prolog environment with some built-

                  in predicates added.  The most important of these is  load, which

                  loads  clauses  into  memory   through  the  GULP  translator.  (A

                  consult  or reconsult  would  not translate  feature structures

                  into their internal representations.) Thus,


                        ?- load myprog.


                  loads clauses from the file MYPROG.GLP.


                        Like reconsult, load clears away any pre-existing clauses

                  for a predicate when new clauses for that predicate (with the same

                  arity)  are first encountered  in a file. However,  load does not

                  require  the clauses for a predicate  to be contiguous, so long as

                  they all occur in the same file. A program can  consist of several

                  files that are loaded into memory together.


                        Another predicate, ed,  calls a full-screen editor and  then

                  loads the  file. Without  an argument, ed  or load uses  the same

                  file name as on the most recent invocation of either ed or load.


                        Other special  predicates are used within  the program. GULP

                  1.1 required a declaration such as


                        g_features([gender,number,case,person,tense]).


                  declaring all feature names before any were used. This declaration

                  is  optional in GULP 2.0. If present,  it establishes the order in

                  which features will appear whenever a feature structure is output,

                  and it can be used  to optimize the program by putting  frequently

                  used features  at  the  beginning.  Further, whether  or  not  the

                  programmer includes a g_features declaration, GULP 2.0 maintains

                  in memory an up-to-date g_features clause with a list of all the

                  features  actually  used,   in  the  order  in   which  they  were

                  encountered.


                        The   predicate   g_translate/2   interconverts   feature

                  structures and  their  internal  representations.  This  makes  it

                  possible  to  process,  at  runtime, feature  structures  in  GULP

                  notation rather  than translated  form. For  instance, if  X is  a

                  feature   structure,  then  g_translate(Y,X),  write(Y)  will

                  display it in GULP notation.


                        The   predicate  display_feature_structure   outputs  a

                  feature  structure, not  in  GULP notation,  but  in a  convenient

                  tabular format, thus:


                        syn: case: acc

                             gender: masc

                        sem: pred: MAN

                             countable: yes

                             animate: yes


                  This  is similar  to traditional  feature structure  notation, but

                  without brackets.


                  3.4. Internal representation


                        The  nature  of   value  lists,   which  represent   feature

                  structures  internally,   is  best  approached  by   a  series  of

                  approximations.  The  nearest  Prolog  equivalent  to   a  feature

                  structure is a  complex term  with one position  reserved for  the

                  value of every feature. Thus


                         [ number:plural ]

                         [ person:third  ]

                         [ gender:fem    ]


                  could    be    represented    as     x(plural,third,fem)    or

                  [plural,third,fem] or  the like. It is  necessary to decide in

                  advance which argument position corresponds to each feature. 

                                       
                        A feature structure that  does not use all of  the available

                  features is equivalent to a term with anonymous variables; thus


                         [ person:third ]


                  would be represented as x(_,third,_) or [_,third,_].


                        Structures of  this type  simulate graph unification  in the

                  desired  way.     They  can  be   recursively  embedded.  Further,

                  structures  built by instantiating Prolog variables are inherently

                  re-entrant, since  an instantiated  Prolog variable is  actually a

                  pointer to the memory representation of its value. 
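

                        These first approximations can be tried directly in any

                  Prolog. The sketch below assumes the argument order

                  x(Number,Person,Gender); the predicate names are hypothetical

                  and serve only to illustrate the three points just made.

```prolog
% Hypothetical fixed-arity representation: x(Number, Person, Gender).

% Two partial structures unify into one that carries both values.
unify_demo(Result) :-
    A = x(plural, _, _),      % number:plural
    B = x(_, third, _),       % person:third
    A = B,                    % ordinary Prolog unification
    Result = A.

% Conflicting values block unification, as with feature structures.
conflict_demo :-
    \+ ( x(plural, _, _) = x(singular, _, _) ).

% Re-entrancy: both structures share the variable Number, so a value
% supplied through Subj is immediately visible through Verb as well.
reentrancy_demo(Subj, Verb) :-
    Subj = x(Number, third, _),
    Verb = x(Number, _, _),
    Subj = x(plural, _, _).
```

                  Here unify_demo(R) yields R = x(plural,third,_), and after

                  reentrancy_demo(S,V) the number position of V is plural even

                  though only S was instantiated explicitly.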


                        All  the feature structures  in a program  must be unifiable

                  unless  they contain conflicting  values. Accordingly,  if fifteen

                  features  are used in the  program, every value  list must reserve

                  positions  for all fifteen. One option would be to represent value

                  lists as 15-argument structures:


                  tense:present  =>   x(_,_,_,_,present,_,_,_,_,_,_,_,_,_,_)


                        This obviously wastes memory. A better solution would  be to

                  use lists; a  list with  an uninstantiated tail  unifies with  any

                  longer list. The improved representation is:


                  tense:present  =>   [_,_,_,_,present|_]


                  By  putting  frequently used  features  near  the beginning,  this

                  representation can save a considerable amount of memory as well as

                  reducing the time needed  to do unifications. Further,  lists with

                  uninstantiated tails gain length automatically as further elements

                  are  filled in;  unifying [a,b,c|_]  with [_,_,_,_,e|_]  gives

                  [a,b,c,_,e|_].


                        If most  of  the lists  in the  program have  uninstantiated

                  tails, the program  can be  simplified by requiring  all lists  to

                  have  uninstantiated tails.  Any process  that searches  through a

                  list  will then need to  check for only  one terminating condition

                  (remainder of  list uninstantiated) rather than  two (remainder of

                  list uninstantiated or empty).
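

                        Both points can be checked with ordinary Prolog lists

                  standing in for value lists. In this sketch open_list_demo/1

                  and open_length/2 are hypothetical helper names; open_length/2

                  assumes that its argument always ends in an uninstantiated

                  tail, so it needs only the one stopping condition.

```prolog
% Unifying two open-ended lists: the shorter one grows as needed.
open_list_demo(L) :-
    L = [a, b, c|_],
    L = [_, _, _, _, e|_].

% open_length(+OpenList, -N): number of positions before the open tail.
% The only terminating condition is an uninstantiated remainder.
open_length(L, 0) :-
    var(L), !.
open_length([_|T], N) :-
    open_length(T, M),
    N is M + 1.
```

                  After open_list_demo(L), L is [a,b,c,_,e|_] and

                  open_length(L,N) gives N = 5.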


                        But  the  GULP internal  value  list  structure  is  not  an

                  ordinary list. If it were,  translated feature structures would be

                  confused with  ordinary Prolog  lists, and programmers  would fall

                  victim to unforeseen unifications. It would also be impossible  to

                  test whether a term is a value list.


                        Recall that Prolog  lists are held  together by the  functor

                  `.'. That is,


                              [a,b,c|X]   =   .(a,.(b,.(c,X)))


                  To get a  distinct type of list,  all we need to do  is substitute

                  another  functor for  the  dot. GULP  uses  g_/2. (In  fact,  all

                  functors beginning  with  g_ are  reserved  by  GULP for  internal

                  use.) So  if tense is the  fifth feature in the  canonical order,

                  then


                  tense:present  =>   g_(_,g_(_,g_(_,g_(_,g_(present,_)))))


                  It  doesn't matter that this looks ugly; the GULP programmer never

                  sees it.
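

                        A term of this shape is easy to build mechanically. The

                  following sketch is not GULP's own code; value_list/3 is a

                  hypothetical predicate that places a value at a given position

                  (counting from 1) in the canonical feature order.

```prolog
% value_list(+Position, ?Value, ?ValueList)
% Build the minimal g_/2 value list with Value at Position.
value_list(1, Value, g_(Value, _)).
value_list(N, Value, g_(_, Rest)) :-
    N > 1,
    M is N - 1,
    value_list(M, Value, Rest).
```

                  The query ?- value_list(5,present,V). produces the term shown

                  above. Calling value_list(2,third,V) and then

                  value_list(5,present,V) on the same V fills both positions of

                  one list, and the result does not unify with any ordinary

                  Prolog list.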


                        One  more refinement  (absent  before GULP  version 2.0)  is

                  needed. We  want to  be able  to translate  value lists  back into

                  feature structure  notation. For this purpose  we must distinguish

                  features  that  are  unmentioned  from features  that  are  merely

                  uninstantiated. That is, we do not  want tense:X to turn  into an

                  empty feature structure  just because X is uninstantiated.  It may

                  be useful  to know,  during program  testing, that  X has  unified

                  with some other  variable even  if it  has not  acquired a  value.

                  Thus, we  want  to  record,  somehow,  that  the  variable  X  was

                  mentioned in the original feature  structure whereas the values of

                  other features (person, number, etc.) were not.


                        Accordingly, g_/1 (distinct from g_/2) is used to mark all

                  features  that  were  mentioned  in  the  original  structure.  If

                  person is second  in the canonical order, and  tense is fifth in

                  the canonical order (as before), then


                  tense:present..person:X =>

                              g_(_,g_(g_(X),g_(_,g_(_,g_(g_(present),_)))))


                  And  this is the representation  actually used by  GULP. Note that

                  the  use of  g_/1  does not  interfere with  unification, because

                  g_(present)   will  unify  both   with  g_(Y)   (an  explicitly

                  mentioned variable) and with an empty position.
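

                        The effect of the g_/1 marks can be seen by inspecting

                  such a term. In the sketch below, example/2 simply restates the

                  translation given above, and mentioned/2 is a hypothetical

                  helper (not a GULP built-in) that tests whether a position

                  carries a mark.

```prolog
% The translation of tense:present..person:X given in the text,
% with person second and tense fifth in the canonical order.
example(X, g_(_, g_(g_(X), g_(_, g_(_, g_(g_(present), _)))))).

% mentioned(+ValueList, +N): position N carries a g_/1 mark, i.e. the
% feature was mentioned, whether or not its value is instantiated.
mentioned(g_(Cell, _), 1) :-
    nonvar(Cell),
    Cell = g_(_).
mentioned(g_(_, Rest), N) :-
    N > 1,
    nonvar(Rest),
    M is N - 1,
    mentioned(Rest, M).
```

                  For the example term, mentioned/2 succeeds for positions 2 and

                  5 and fails for position 1, even though X itself is still

                  uninstantiated.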


                  3.5. How translation is done


                        GULP loads a program by reading it, one term at a time, from

                  the input file, and translating all the feature structures in each

                  term into  value lists. The  term is then  passed to  the built-in

                  predicate  expand_term,  which  translates  grammar  rule  (DCG)

                  notation into plain  Prolog. The result is then asserted into the

                  knowledge  base. There are two exceptions: a term that begins with

                  `:-' is  executed immediately, just as  in ordinary  Prolog, and a

                  g_features  declaration   is  given  special  treatment   to  be

                  described below.


                        To make translation possible, GULP maintains a stored set of

                  forward  translation  schemas,  plus  one  backward   schema.  For

                  example,  a  program   that  uses  the  features  a,  b,   and  c

                  (encountered in that  order) will  result in the  creation of  the

                  schemas:


                  g_forward_schema(a,X,g_(X,_)).

                  g_forward_schema(b,X,g_(_,g_(X,_))).

                  g_forward_schema(c,X,g_(_,g_(_,g_(X,_)))).


                  g_backward_schema(a:X..b:Y..c:Z,g_(X,g_(Y,g_(Z,_)))).


                  Each  forward schema contains a  feature name, a  variable for the

                  feature  value,  and  the  minimal corresponding  value  list.  To

                  translate the  feature  structure a:xx..b:yy..c:zz,  GULP  will

                  mark each  of the feature values with  g_(...), and then call, in

                  succession,


                  g_forward_schema(a,g_(xx), ... ),

                  g_forward_schema(b,g_(yy), ... ),

                  g_forward_schema(c,g_(zz), ... ) ...


                  and  unify the resulting value lists.  The result will be the same

                  regardless of the order in which the calls are made. To  translate

                  a complex  Prolog term, GULP first  converts it into  a list using

                  `=..', then recursively translates all  the elements of the  list

                  except the first, then converts the result back into a term.
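

                        The forward step can be sketched in a few lines of plain

                  Prolog. This is an illustration under stated assumptions, not

                  the GULP source: the schemas are the three shown above, the

                  input is a flat list of Feature-Value pairs rather than GULP

                  notation, nested feature structures are not handled, and

                  translate/2 is a hypothetical name.

```prolog
% Forward schemas for features a, b and c, as listed in the text.
g_forward_schema(a, X, g_(X, _)).
g_forward_schema(b, X, g_(_, g_(X, _))).
g_forward_schema(c, X, g_(_, g_(_, g_(X, _)))).

% translate(+Pairs, ?ValueList)
% Mark each value with g_/1, look up the schema for its feature, and
% unify every resulting minimal value list with the same ValueList.
translate([], _).
translate([Feature-Value|Pairs], ValueList) :-
    g_forward_schema(Feature, g_(Value), ValueList),
    translate(Pairs, ValueList).
```

                  Both translate([a-xx,b-yy,c-zz],V) and

                  translate([c-zz,a-xx,b-yy],V) give

                  V = g_(g_(xx),g_(g_(yy),g_(g_(zz),_))), which shows the order

                  independence; translate([a-xx,a-yy],V) fails because the two

                  values conflict.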


                        Backward  translation is  easier;  GULP  simply unifies  the

                  value list with  the second argument of g_backward_schema,  and

                  the first  argument immediately yields a rough  translation. It is

                  rough in  two ways: it mentions  all the features in  the grammar,

                  and it contains g_(...) marking  all the feature values that were

                  mentioned  in   the  original  feature  structure.   The  finished

                  translation is  obtained by  discarding all features  whose values

                  are not marked by g_(...), and removing the g_(...) from values

                  that contain it.
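

                        Backward translation can be sketched in the same style.

                  Again the details are assumptions: the rough translation is

                  represented here as a list of Feature-Value pairs instead of

                  GULP notation, only the features a, b and c exist, nested

                  structures are ignored, and back_translate/2 and

                  keep_mentioned/2 are hypothetical names.

```prolog
% Backward schema for features a, b and c, using a list of pairs
% as the rough translation.
g_backward_schema([a-X, b-Y, c-Z], g_(X, g_(Y, g_(Z, _)))).

% back_translate(+ValueList, -Pairs)
back_translate(ValueList, Pairs) :-
    g_backward_schema(Rough, ValueList),
    keep_mentioned(Rough, Pairs).

% Keep only features whose value is marked with g_/1, and strip the mark.
keep_mentioned([], []).
keep_mentioned([Feature-Cell|Rest], Pairs) :-
    (   nonvar(Cell), Cell = g_(Value)
    ->  Pairs = [Feature-Value|Pairs1]
    ;   Pairs = Pairs1
    ),
    keep_mentioned(Rest, Pairs1).
```

                  Given the value list g_(g_(xx),g_(_,g_(g_(Z),_))), the result

                  is [a-xx,c-Z]: the unmentioned feature b is dropped, while c

                  is kept although its value is an uninstantiated variable.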


                        The translation  schemas are built automatically. Whenever a

                  new feature is encountered, a forward schema is built for  it, and

                  the pre-existing backward  schema, if  any, is replaced  by a  new

                  one. A g_features declaration causes the immediate generation of

                  schemas  for all  the  features  in it,  in  the  order given.  In

                  addition,  GULP maintains  a  current g_features  clause at  all

                  times that lists all the features actually encountered, whether or

                  not they were originally declared.


                  4. GULP in practical use


                  4.1. A simple definite clause grammar


                        Figure 7  shows the grammar  from Figure 3  implemented with

                  the  definite  clause grammar  (DCG)  parser  that  is built  into

                  Prolog. Each nonterminal  symbol has a  GULP feature structure  as

                  its only argument. 


                        Parsing is done top-down. The output of the program reflects

                  the feature structures built during parsing. For example:


                        ?- test1.

                        [max,sees,bill]     (String being parsed)

                        sem: pred: SEES     (Displayed feature structure)

                             arg1: BILL

                             arg2: MAX


                        Figure  8 shows the same grammar written in a more PATR-like

                  style. Instead of using  feature structures in argument positions,

                  this  program  uses variables  for  arguments,  then unifies  each

                  variable  with   appropriate  feature  structures  as  a  separate

                  operation.  This is slightly less  efficient but can  be easier to

                  read,  particularly  when the  unifications  to  be performed  are

                  complex.


                        In  this  program, the  features  of np  and vp  are called

                  NPfeatures and  VPfeatures  respectively. More  commonly,  the

                  features  of np, vp, and so  on are in variables  called NP, VP,

                  and  the like.  Be careful  not to  confuse upper-  and lower-case

                  symbols.


                        The rules in Figure  8 could equally well have  been written

                  with the unifications before the  constituents to be parsed.  That

                  is, we can write either


                  s(Sfeatures) --> np(NPfeatures), vp(VPfeatures), 

                                             { Sfeatures = ... }.


                  or


                  s(Sfeatures) --> { Sfeatures = ... }, 

                                        np(NPfeatures), vp(VPfeatures).


               Because  unification  is  order-independent,  the  choice  affects

                  efficiency but  not correctness. The  only exception is  that some

                  rules can loop when written one way but not the other. Thus


                  s(S1) --> s(S2), { S1 = x:a, S2 = x:b }.


                  loops, whereas


                  s(S1) --> { S1 = x:a, S2 = x:b }, s(S2).


                  does not,  because in  the  latter case  S2 is  instantiated to  a

                  value that must be distinct from S1 before s(S2) is parsed.


                  4.2. A hold mechanism for unbounded movements


                        Unlike  a  phrase-structure  grammar,   a  unification-based

                  grammar can  handle unbounded  movements.  That is,  it can  parse

                  sentences  in which some element  appears to have  been moved from

                  its normal position across an arbitrary amount of structure.


                        Such a  movement occurs in English  questions. The question-

                  word (who, what, or  the like) always appears at  the beginning of

                  the sentence. Within  the sentence, one of the places where a noun

                  phrase could have appeared is empty:


                        The boy said the dog chased the cat.

                        What did the boy say _ chased the cat? (The dog.)

                        What did the boy say the dog chased _? (The cat.)


                  Ordinary phrase-structure rules cannot  express the fact that only

                  one  noun phrase  is missing.  Constituents introduced  by phrase-

                  structure rules are either optional or obligatory. If noun phrases

                  are obligatory, they  can't be  missing at  all, and  if they  are

                  optional, any number of them can be missing at the same time.


                        Chomsky (1957) analyzed such sentences by generating what in

                  the position of  the missing  noun phrase, then  moving it to  the

                  beginning  of the sentence by  means of a  transformation. This is

                  the generally accepted analysis.


                        To parse such sentences, one must undo the movement. This is

                  achieved through a  hold stack. On  encountering what, the  parser

                  does not parse it, but rather puts it  on the stack and carries it

                  along until it  is needed. Later, when  a noun phrase is  expected

                  but not found, the parser can pop what off the stack and use it.


                        The hold stack is a  list to which elements can be  added at

                  the  beginning. Initially,  its value is  [] (the  empty list). To

                  parse a sentence, the parser must:


                        (1)  Pass the hold stack to the  NP, which may add or remove

                              items.

                        (2)  Pass the possibly modified  stack to the  VP, which may

                              modify it further.


                  In traditional notation, the rule we need is:


                      S            -->            NP                       VP


            [ hold: [ in:  H1 ] ]       [ hold: [ in:  H1 ] ]    [ hold: [ in:  H2 ] ]

            [       [ out: H3 ] ]       [       [ out: H2 ] ]    [       [ out: H3 ] ]


                  Here hold:in is  the stack  before parsing  a given  constituent,

                  and hold:out is the  stack after parsing that same  constituent.

                  Notice  that three different states  of the stack --  H1, H2, and

                  H3 -- are allowed for.


                        Figure 9 shows a  complete grammar built with rules  of this

                  type. There are two rules expanding S. One is the one above (S -->

                  NP VP). The  other one accepts  what did at  the beginning of  the

                  sentence, places what  on the stack,  and proceeds to parse  an NP

                  and  VP. Somewhere  in the  NP  or VP  --  or in  a subordinate  S

                  embedded therein -- the parser will use the rule


                  np(NP) --> [], { NP = hold: (in:[what|H1]..out:H1) }.


                  thereby removing what from the stack. 
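

                        The mechanism does not depend on GULP notation. The

                  following plain-Prolog DCG is a much reduced sketch, not the

                  grammar of Figure 9: the hold stack is passed as two explicit

                  arguments (in and out), and the nonterminal sentence//0, the

                  verb classes and the small lexicon are assumptions made for

                  illustration.

```prolog
% A sentence starts and ends with an empty hold stack.
sentence --> s([], []).

% s(HoldIn, HoldOut)
s(H0, H) --> [what, did], np([what|H0], H1), vp(H1, H).   % push "what"
s(H0, H) --> np(H0, H1), vp(H1, H).                        % ordinary S

% An NP is either overt (stack unchanged) or a gap that pops "what".
np(H, H)        --> [the, N], { noun(N) }.
np([what|H], H) --> [].

% A VP takes either an NP object or an embedded S.
vp(H0, H) --> [V], { trans_verb(V) },  np(H0, H).
vp(H0, H) --> [V], { clause_verb(V) }, s(H0, H).

noun(boy).    noun(dog).    noun(cat).
trans_verb(chased).
clause_verb(say).
clause_verb(said).
```

                  With this sketch, phrase(sentence,...) accepts

                  [what,did,the,boy,say,the,dog,chased] and

                  [what,did,the,boy,say,chased,the,cat], but rejects

                  [what,did,the,boy,say,the,dog,chased,the,cat], where no noun

                  phrase is missing and what is therefore never popped.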


                  4.3. Building complex semantic structures


                        Figure  10 shows  a GULP  reimplementation of  a program  by

                  Johnson and Klein  (1986) that makes extensive use of  in and out

                  features to pass  information around the  parse tree. Johnson  and

                  Klein's key insight is that the logical structure of a sentence is

                  largely  specified by the determiners.  For instance, A  man saw a

                  donkey expresses a simple proposition with existentially quantified

                  variables,  but Every  man  saw a  donkey  expresses an  "if-then"

                  relationship (If X is a man then X saw a donkey). On the syntactic

                  level, every  modifies only  man, but  semantically, it gives  the

                  entire sentence a different structure.


                        Accordingly, Johnson and  Klein construct  their grammar  so

                  that  almost   all  the  semantic   structure  is  built   by  the

                  determiners. Each  determiner must receive, from  elsewhere in the

                  sentence,  semantic   representations  for   its  scope   and  its

                  restrictor. The scope of a determiner is the main predicate of the

                  clause, and the  restrictor is an additional condition  imposed by

                  the NP to which the determiner belongs. For instance, in Every man

                  saw a  donkey, the  determiner every  has scope  saw a  donkey and

                  restrictor man.


                        The original program was written by Johnson and Klein in

                  PrAtt (a different extension of Prolog); Figure 10 is its GULP

                  reimplementation. The semantic representations it builds are

                  those   used  in  Discourse  Representation  Theory  (Kamp,  1981;

                  Spencer-Smith,  1987). The meaning  of a sentence  or discourse is

                  represented by a discourse representation structure (DRS) such as:


                        [1,2,man(1),donkey(2),saw(1,2)]


                  Here 1 and 2 stand for  entities (people or things), and man(1),

                  donkey(2),  and saw(1,2)  are  conditions  that these  entities

                  must  meet. The discourse is  true if there  are two entities such

                  that 1 is a man, 2  is a donkey, and 1  saw 2. In other words,  "A

                  man  saw  a   donkey."  The   order  of  the   list  elements   is

                  insignificant,  and the  program  builds the  list backward,  with

                  indices and conditions mixed together.
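

                        Since a DRS of this kind is an ordinary Prolog list, it

                  can be taken apart with ordinary list predicates. The sketch

                  below is not part of Figure 10; drs_parts/3 and same_drs/2 are

                  hypothetical helpers, and both assume that indices are

                  integers and look only at the top level of the list.

```prolog
% drs_parts(+DRS, -Indices, -Conditions)
% Separate the integer indices from the conditions.
drs_parts([], [], []).
drs_parts([X|Xs], [X|Is], Cs) :-
    integer(X), !,
    drs_parts(Xs, Is, Cs).
drs_parts([X|Xs], Is, [X|Cs]) :-
    drs_parts(Xs, Is, Cs).

% same_drs(+A, +B): A and B have the same elements in any order.
same_drs(A, B) :-
    sort(A, Sorted),
    sort(B, Sorted).
```

                  For the DRS above, drs_parts/3 gives the indices [1,2] and the

                  conditions [man(1),donkey(2),saw(1,2)], and same_drs/2 treats

                  [saw(1,2),2,man(1),1,donkey(2)] as the same DRS.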


                        A DRS can contain other DRSes embedded in a variety of ways.

                  In particular, one  of the  conditions within a  DRS can have  the

                  form


                    DRS1 > DRS2


                  which  means:  "This condition  is satisfied  if  for each  set of

                  entities  that satisfy DRS1, it is also possible to satisfy DRS2."

                  For example:


                    [1,man(1), [2,donkey(2)] > [saw(1,2)] ]


                  "There is an entity 1 such that 1 is a man, and for every entity 2

                  that is a  donkey, 1 saw 2." That is, "Some man saw every donkey."

                  Again,


                        [ [1,man(1)] > [2,donkey(2),saw(1,2)] ]


                  means "every  man saw a  donkey" -- that  is, "for every  entity 1

                  such that 1 is a man, there is an entity 2 which is a donkey and

                  which 1 saw."


                        Parsing a sentence begins with the rule:


                  s(S) --> { S = sem:A,         NP = sem:A,

                             S = syn:B,         VP = syn:B,

                             NP = sem:scope:C,  VP = sem:C,

                             VP = syn:arg1:D,   NP = syn:index:D }, np(NP), vp(VP).


                  This rule stipulates the following things:


                  (1) An S consists of an NP and a VP.


                  (2) The  semantic representation of the  S is the same  as that of

                  the NP, i.e., is built by the rules that parse the NP.


                  (3) The syntactic feature structure (syn) of the S is that of the

                  VP. Crucially,  this contains the  indices of the  subject (arg1)

                  and object (arg2).


                  (4)  The scope  of the  NP (and  hence of  its determiner)  is the

                  semantic representation of the VP.


                  (5) The  index of  the verb's  subject (arg1) is  that of  the NP

                  mentioned in this rule.


                  Other  rules do  comparable amounts  of work, and  space precludes

                  explaining  them in detail here. (See Johnson and Klein 1985, 1986

                  for  further explanation.)  By  unifying appropriate  in and  out

                  features, the  rules perform a  complex computation  in an  order-

                  independent way.
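
                        To make the unifications in the S rule concrete, here is a

                  hand translation into plain Prolog. It is a sketch only: the

                  fixed layouts fs/2, sem/2, and syn/3 are assumptions standing in

                  for whatever GULP would generate, and the two lexical rules are

                  invented so that the rule can be run.

```prolog
% Hypothetical fixed layouts standing in for GULP's translated structures:
%   whole constituent:  fs(Sem, Syn)
%   semantics:          sem(Scope, Content)
%   syntax:             syn(Index, Arg1, Arg2)
s(fs(A, B)) -->
    { A  = sem(C, _),             % NP = sem:scope:C,  VP = sem:C
      B  = syn(_, D, _),          % VP = syn:arg1:D
      NP = fs(A, syn(D, _, _)),   % NP = sem:A,  NP = syn:index:D
      VP = fs(C, B) },            % VP = syn:B   (S = sem:A, S = syn:B in head)
    np(NP),
    vp(VP).

% Toy lexical rules, invented for this example only.
np(fs(sem(_Scope, every_dog), syn(1, _, _))) --> [every, dog].
vp(fs(barked(Subj), syn(_, Subj, _)))        --> [barked].
```

                  After ?- phrase(s(S), [every,dog,barked]). the index 1 supplied

                  by the NP has reached the verb's arg1, and the VP's semantics has

                  become the scope of the NP, although neither lexical rule

                  mentions the other.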


                  4.4. Bottom-up parsing


                        GULP is not tied  to Prolog's built-in DCG parser. It can be

                  used  with any other parser implemented in Prolog. Figure 11 shows

                  how GULP can be  used with the BUP  bottom-up parser developed  by

                  Matsumoto et al. (1986).6


                        In  bottom-up parsing, the typical question is not "How do I

                  parse an NP?" but rather,  "Now that I've parsed an NP,  what do I

                  do with it?"  BUP puts the Prolog search mechanism  to good use in

                  answering questions like this.


                        During a BUP parse, two kinds of goals occur. A goal such as


                              ?- np(s,NPf,Sf,[chased,the,cat],[]).


                  means: "An NP has  just been accepted; its features  are contained

                  in  NPf. This occurred  while looking for an  S with features Sf.

                  Immediately   after  parsing   the  NP,   the  input   string  was

                  [chased,the,cat]. After parsing the S, it will be []." 


                        The other type of goal is


                              ?- goal(vp,VPf,[chased,the,cat],[]).


                  This means "Parse a VP with features VPf, starting with the input

                  string [chased,the,cat] and ending  up with []." This is  like

                  the DCG goal


                              ?- vp(VPf,[chased,the,cat],[]).


                  except that the parsing is to be done bottom-up.


                        To see  how these  goals are constructed,  imagine replacing

                  the top-down parsing rule


                              s --> np, vp.


                  with the bottom-up rule


                              np, vp --> s.


                  This rule  should be used  when the parser  is looking for  a rule

                  that will tell it how to use an NP it has just found. So  np(...)

                  should  be  the  head  of  the  Prolog  clause.  Ignoring  feature

                  unifications, the clause will be:


                              np(G,NPf,Gf,S1,S3) :- goal(vp,VPf,S1,S2),

                                               s(G,Sf,Gf,S2,S3).


                       6 It has been suggested that a combination of GULP

               and BUP should be known as BURP. This suggestion has

               not been acted upon.




                  That is: "Having just  found an NP with features NPf, parse  a VP

                  with features VPf. You will then have completed an S, so look for

                  a clause that tells you what to do with it." 


                        Here  S1, S2, and S3 represent  the input string initially,

                  after parsing the VP, and after completing the  S. G is the higher

                  constituent that was  being sought when the  NP was found, and  Gf

                  contains its features. If,  when the S is completed, it  turns out

                  that an  S was being  sought (the usual case),  then execution can

                  finish with the terminal rule


                              s(s,F,F,X,X).


                  Otherwise another clause for s(...) must be searched for.


                        Much  of the  work  of  BUP  is  done  by  the  goal-forming

                  predicate goal, defined thus:


                  goal(G,Gf,S1,S3) :-

                      word(W,Wf,S1,S2),

                      NewGoal =.. [W,G,Wf,Gf,S2,S3],

                      call(NewGoal).


                  That is  (ignoring features):  "To parse a  G in  input string S1

                  leaving  the remaining  input in  S3,  first  accept a  word, then

                  construct a new  goal depending on its category (W)." For example,

                  the query


                  ?- goal(s,Sf,[the,dog,barked],S3).


                  will first call


                  ?- word(W,Wf,[the,dog,barked],[dog,barked]).


                  thereby instantiating W  to det and  Wf to  the word's  features,

                  and then construct and call the goal


                  ?- det(s,Wf,Sf,[dog,barked],S3).


                  That is: "I've just completed a det and am trying to parse an  s.

                  What do I do next?" A rule such as


                  det, n --> np


                  (or  rather its  BUP equivalent)  can be  invoked next,  to accept

                  another word (a noun) and complete an NP.
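
                        The pieces described in this section can be assembled into

                  a small runnable parser. The sketch below ignores features

                  entirely (every feature argument is left uninstantiated), and its

                  lexicon and its rules for det, n, and v are invented for

                  illustration; only goal/4, the np clause, and the terminal clause

                  for s follow the text.

```prolog
% Lexicon: word(Category, Features, InputBefore, InputAfter).
word(det, _, [the|S],    S).
word(n,   _, [dog|S],    S).
word(n,   _, [cat|S],    S).
word(v,   _, [barked|S], S).
word(v,   _, [chased|S], S).

% goal/4 as defined in the text.
goal(G, Gf, S1, S3) :-
    word(W, Wf, S1, S2),
    NewGoal =.. [W, G, Wf, Gf, S2, S3],
    call(NewGoal).

% det, n --> np
det(G, _Detf, Gf, S1, S3) :-
    goal(n, _Nf, S1, S2),
    np(G, _NPf, Gf, S2, S3).
det(det, F, F, X, X).

n(n, F, F, X, X).

% np, vp --> s
np(G, _NPf, Gf, S1, S3) :-
    goal(vp, _VPf, S1, S2),
    s(G, _Sf, Gf, S2, S3).
np(np, F, F, X, X).

% v --> vp          (intransitive)
v(G, _Vf, Gf, S1, S2) :-
    vp(G, _VPf, Gf, S1, S2).
% v, np --> vp      (transitive)
v(G, _Vf, Gf, S1, S3) :-
    goal(np, _NPf, S1, S2),
    vp(G, _VPf, Gf, S2, S3).
v(v, F, F, X, X).

% Terminal clauses: the constituent just completed is the one sought.
vp(vp, F, F, X, X).
s(s, F, F, X, X).
```

                  With these clauses, ?- goal(s,_,[the,dog,chased,the,cat],[]).

                  succeeds. Every category needs a terminal clause of the form

                  c(c,F,F,X,X) so that a completed constituent can be handed back

                  to the goal that was looking for it.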


                  5. Comparison with other systems


                  5.1. GULP versus PATR-II




                        PATR-II (Shieber 1986a, b) is the  most widely used software

                  tool for  implementing unification-based grammars, as  well as the

                  most  mature and sophisticated. It differs from GULP in three main

                  ways:


                  (1) Whereas GULP is an extension of Prolog, PATR-II is a new self-

                  contained programming language.


                  (2)  Whereas GULP allows the use of any parsing algorithm, PATR-II

                  provides  one specific  parser  (left-corner,  Earley,  or  Cocke-

                  Kasami-Younger, depending on the version).


                  (3) Whereas GULP  grammar rules treat  feature structures as  data

                  items, PATR-II grammar rules state equations on feature values.


                        Of these,  (3) makes  the biggest practical  difference. The

                  rule which GULP writes as


                  s(person:X..number:Y) -->

                    np(person:X..number:Y),

                    vp(person:X..number:Y).


                  (assuming use of the DCG parser) is rendered in PATR-II as:


                  Rule S --> NP VP:

                    <S number>  =  <NP number>

                    <S number>  =  <VP number>

                    <S person>  =  <NP person>

                    <S person>  =  <VP person>.


                  or the like. Paths are permitted, of course; one  could write <NP

                  syn agr number> to build a more complex structure.
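
                        For readers without GULP at hand, the agreement rule can be

                  approximated in plain Prolog by a term with one argument per

                  feature. This is a sketch: agr/2 is a hypothetical fixed-position

                  stand-in for person:X..number:Y, and the four lexical rules are

                  invented.

```prolog
% agr(Person, Number): hypothetical stand-in for person:X..number:Y.
s(agr(P, N)) --> np(agr(P, N)), vp(agr(P, N)).

% Invented lexical rules.
np(agr(third, singular)) --> [fido].
np(agr(third, plural))   --> [dogs].
vp(agr(third, singular)) --> [barks].
vp(agr(third, plural))   --> [bark].
```

                  The shared variables P and N do the work of the four PATR-II

                  equations: ?- phrase(s(_), [fido,bark]). fails because the two

                  number values do not unify.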


                        Here S, NP, and VP are not pure variables; the equations


                  <S cat>  = s

                  <NP cat> = np

                  <VP cat> = vp


                  (or the equivalent) are implicit. Further abbreviatory power comes

                  from templates, which are predefined  sets of features and values.

                  Thus, instead of writing the lexical entry


                  Word sleeps:  <cat> = v

                                <person> = third

                                <number> = singular

                                <subcat> = intransitive.


                  the PATR-II programmer can define the template




                  Let ThirdSingVerb be  <cat> = v

                                        <person> = third

                                        <number> = singular.


                  and then write: 


                  Word sleeps:  ThirdSingVerb

                                <subcat> = intransitive.


                  Word chases:  ThirdSingVerb

                                <subcat> = transitive.


                  The GULP equivalent of a template is a Prolog fact such as: 


                  thirdsingverb(person:third..number:singular).


                  Lexical entries can then use this as an abbreviatory device:


                  v(Vf) --> [sleeps], { thirdsingverb(Vf),

                                        Vf = subcat:intransitive }.


                  v(Vf) --> [chases], { thirdsingverb(Vf),

                                        Vf = subcat:transitive }.


                  (There is no cat:v here because in the DCG parser, categories are

                  functors rather than feature values.)
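
                        The same template idiom works in plain Prolog if a

                  fixed-position term is used in place of the feature structure.

                  The layout vf/3 below is an assumption made for this sketch, not

                  the form GULP actually generates.

```prolog
% vf(Person, Number, Subcat): hypothetical layout for verb features.
thirdsingverb(vf(third, singular, _)).

v(Vf) --> [sleeps], { thirdsingverb(Vf),
                      Vf = vf(_, _, intransitive) }.

v(Vf) --> [chases], { thirdsingverb(Vf),
                      Vf = vf(_, _, transitive) }.
```

                  Calling the template and then unifying with a further pattern

                  fills in the remaining position, just as the GULP entries do.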


                        Unlike GULP, PATR-II provides  for default inheritance. That

                  is, the programmer can  invoke a template and then change  some of

                  the values that it supplies, thus:


                  Word does:  ThirdSingVerb

                              <cat> = auxverb.


                  This means: "Does is  a ThirdSingVerb except that its  category is

                  not v  but rather  auxverb." PATR-II  also provides  for lexical

                  redundancy rules  that transform  one lexical entry  into another,

                  e.g., building a passive verb from every active verb. 


                        Both  of these capabilities are absent from GULP per se, but

                  they could  be built into  a parser written in  GULP. Indeed, many

                  contrasts between GULP and PATR-II  reflect the fact that  PATR-II

                  is a custom-built environment for implementing grammars that fit a

                  particular mold, while GULP  is a minimal extension to a much more

                  general-purpose programming language.
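
                        One way such default inheritance might be built on top of

                  Prolog is sketched below. Unification alone cannot change a value

                  once it is set, so the sketch keeps templates as lists of

                  Feature=Value pairs and overrides them before any structure is

                  built. The names template/2, with_overrides/3, and overridden/2

                  are hypothetical, and exclude/3 is assumed from SWI-Prolog's

                  library(apply).

```prolog
:- use_module(library(apply)).   % exclude/3 (assumed: SWI-Prolog)
:- use_module(library(lists)).   % append/3, memberchk/2

% A template as a list of Feature=Value defaults (hypothetical format).
template(third_sing_verb, [cat=v, person=third, number=singular]).

% with_overrides(+Defaults, +Overrides, -Result)
% Result contains every override, plus each default whose feature
% is not overridden.
with_overrides(Defaults, Overrides, Result) :-
    exclude(overridden(Overrides), Defaults, Kept),
    append(Overrides, Kept, Result).

% overridden(+Overrides, +Pair): Pair's feature also occurs in Overrides.
overridden(Overrides, Feature=_) :-
    memberchk(Feature=_, Overrides).
```

                  For the entry "does", the template would be called with the

                  override list [cat=auxverb], replacing cat=v while keeping the

                  person and number defaults.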


                        One advantage of GULP is that the full range of  Prolog data

                  structures is  available. Shieber (1986a:28-32)  equips each  verb

                  with  an  ordered list  of NPs  that  are its  syntactic arguments

                  (subject, object, etc.).  But there  are no lists  in PATR-II,  so

                  Shieber has to construct them as nested feature structures:


                  [ first: ...first element...

                    rest:  [ first: ...second element...

                             rest:  [ first: ...third element...

                                      rest:  [ first: ...fourth element...

                                               rest:  end ] ] ] ]


                  This  may be desirable on grounds of theoretical parsimony, but it

                  is   notationally  awkward.   In  GULP,   one  can   simply  write

                  [X1,X2,X3,X4],  where  X1,  X2,  X3,  and  X4  are  variables,

                  constants, feature structures, or terms of any other kind.
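
                        The correspondence between the two notations is mechanical,

                  as the following sketch shows. The functors fs/2, first/1, and

                  rest/1 and the predicate name list_fs/2 are hypothetical; they

                  merely mimic the first/rest encoding in a Prolog term.

```prolog
% list_fs(?List, ?Nested)
% Relates a Prolog list to a nested first/rest structure ending in 'end'.
list_fs([], end).
list_fs([X|Xs], fs(first(X), rest(R))) :-
    list_fs(Xs, R).
```

                  The relation runs in both directions, so a nested structure can

                  also be flattened back into an ordinary list.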


                  5.2. GULP versus PrAtt


                        PrAtt (Prolog with Attributes), described briefly by Johnson

                  and  Klein (1986), is a  PATR-like extension of  Prolog. In PrAtt,

                  feature  structure  equations  are  treated as  Prolog  goals.  An

                  example is the DCG rule:


                  s(Sf) --> np(NPf), vp(VPf),

                          { Sf:number = NPf:number,

                            Sf:number = VPf:number,

                            Sf:person = NPf:person,

                            Sf:person = VPf:person }.


                  This  looks almost like GULP syntax, but the meaning is different.

                  NPf:number  is  not  a  Prolog term,  but  rather  an  evaluable

                  expression; at  execution  time, it  is  replaced by  the  number

                  element of structure NPf.


                        Compared  to GULP, PrAtt makes  a much bigger  change to the

                  semantics  of  Prolog.  GULP  merely  translates  data  into  data

                  (changing the format from feature structures to value  lists), but

                  PrAtt translates data into extra operations.


                        An  example will  make  this clearer.  In  order to  execute

                  Sf:number  =  NPf:number,  PrAtt  must  extract  the  number

                  features of Sf and NPf,  then unify them. In Johnson  and Klein's

                  implementation, this extraction is  done at run time; that  is, on

                  finding the expression Sf:number, the PrAtt interpreter looks at

                  the contents of Sf, and  then replaces Sf:number with  the value

                  of Sf's number feature.


                        This implies that  the value of Sf is known. If it is not --

                  for example, if the PrAtt-to-Prolog translation is being performed

                  before  running the program --  then extra goals  must be inserted

                  into the program  to extract the  appropriate feature values.  The

                  single  PrAtt goal  Sf:number =  NPf:number becomes  at least

                  three goals:


                        (1) Unify Sf with something that  will put the number  value

                  into a unique variable (call it X).


                        (2) Unify NPf with something that will put its number value

                  into a unique variable (call it Y).


                        (3) Unify X and Y.


                        To put this  another way, whereas  GULP modifies the  syntax

                  for Prolog terms, PrAtt  modifies the unification algorithm, using

                  three  calls  to  the  existing Prolog  unification  algorithm  to

                  perform one PrAtt unification.
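
                        The contrast can be sketched in plain Prolog. The layout

                  fs(Person,Number) and all three predicate names below are

                  assumptions made for illustration; neither system generates

                  exactly this code.

```prolog
% fs(Person, Number): hypothetical fixed layout for a feature structure.
number_of(fs(_Person, Number), Number).

% PrAtt-style: Sf:number = NPf:number expanded into three unifications.
same_number(Sf, NPf) :-
    number_of(Sf, X),     % (1) extract the number value of Sf
    number_of(NPf, Y),    % (2) extract the number value of NPf
    X = Y.                % (3) unify the two values

% GULP-style: the shared variable is already part of the two terms,
% so unifying the arguments with the clause head does all the work.
same_number_gulp(fs(_, N), fs(_, N)).
```

                  Both predicates accept the same pairs of structures; the

                  difference is that the second needs no goals in its body.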


                  5.3. GULP versus AVAG


                        AVAG  (Attribute-Value   Grammar,   Sedogbo  1986)   is   an

                  implementation of generalized  phrase structure grammar  (GPSG), a

                  framework  for   expressing  linguistic  analyses.   A  three-pass

                  compiler translates AVAG notation into Prolog II. As such, AVAG is

                  far more complex than GULP or PrAtt, and there is  little point in

                  making  a direct  comparison. Comparing  AVAG to PATR-II  would be

                  instructive but is outside the scope of this paper.


                        AVAG  is interesting because it uses  the Prolog II built-in

                  predicates  dif  (which   means  "this  variable  must  never  be

                  instantiated  to  this  value") and  freeze  ("wait  until  these

                  variables   are  instantiated,  then   test  them")  to  implement

                  negative-valued   and  set-valued  features   respectively.    For

                  example, the rule


                        voit:

                         <cat> = verb

                         <person> /= 2

                         <number> = sing.


                  uses dif to ensure that the person feature never equals 2, and


                        chaque:

                         <cat> = art

                         <gender> = [mas,fem]

                         <number> = sing.


                  uses  freeze  to ensure  that  when  the gender  feature becomes

                  instantiated, its  value  is mas  or  fem.  There are  no  direct

                  equivalents  for  dif  or  freeze  in  conventional  (Edinburgh)

                  Prolog; they could be  implemented only by changing  the inference

                  engine.
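
                        Some later Prolog systems, SWI-Prolog among them, do supply

                  dif/2 and freeze/2 as built-in predicates. Assuming such a system,

                  the two lexical entries above might be approximated as follows.

                  This is a sketch, not AVAG's compiled output; the layout fs/3 and

                  the predicate names voit/1 and chaque/1 are hypothetical.

```prolog
:- use_module(library(lists)).   % memberchk/2

% fs(Cat, Person, Number) for the verb; fs(Cat, Gender, Number) for the
% article. Both layouts are assumptions made for this example.

voit(fs(verb, Person, sing)) :-
    dif(Person, 2).                         % <person> /= 2

chaque(fs(art, Gender, sing)) :-
    freeze(Gender,                          % wait until Gender is bound,
           memberchk(Gender, [mas, fem])).  % then require mas or fem
```

                  After ?- voit(fs(_,P,_)). the variable P is still

                  uninstantiated, but a later attempt to unify it with 2 fails.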


                  5.4. GULP versus STUF


                        STUF  (Stuttgart   Formalism)  is  a  formal   language  for

                  describing  unification-based grammars.  It is  more comprehensive

                  than PATR-II and  as yet is only partly  implemented (Bouma et al.

                  1988).


                        Comparing STUF to GULP would be rather like comparing linear

                  algebra  to Fortran;  the systems  are not  in the  same category.

                  Nonetheless, STUF introduces a number of novel ideas that could be

                  exploited in parsers or other systems written in GULP.


                        The  biggest of  these  is  nondestructive  unification.  In

                  Prolog,  unification is  a destructive  operation; terms  that are

                  being unified are replaced by their unifier.  For example, if X =

                  [a,_] and  Y = [_,b],  then after unifying  X and Y, X  = Y =

                  [a,b]. In STUF, on the other hand, an expression such as


                        z = (x y)


                  creates a third structure  z whose value  is the unifier of  x and

                  y; x and  y themselves are unaffected. Nondestructive unification

                  can  be implemented in Prolog by copying the terms before unifying

                  them (Covington et al. 1988:204).
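
                        A minimal sketch of the copying technique, using the

                  standard built-in copy_term/2, follows. The predicate name

                  nd_unify/3 is hypothetical. The two terms are copied together, as

                  a pair, so that any variables they share remain shared in the

                  copies.

```prolog
% nd_unify(+X, +Y, -Z)
% Z is the unifier of X and Y; X and Y themselves are left unchanged.
nd_unify(X, Y, Z) :-
    copy_term(X-Y, X1-Y1),   % fresh copies, preserving shared variables
    X1 = Y1,                 % ordinary (destructive) unification of copies
    Z = X1.
```

                  With X = [a,_] and Y = [_,b], the call nd_unify(X,Y,Z) gives

                  Z = [a,b] while the second element of X stays uninstantiated.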


                        Further, if  the unification fails, z gets the special value

                  FAIL. If x  and y are feature  structures, and parts of  them are

                  unifiable but other parts are not, the non-unifiable parts will be

                  represented  by  FAIL  in  the  corresponding  parts of  z.  This

                  provides a way to implement negative-valued features. For example,

                  to ensure  that  a verb  is  not third  person singular,  one  can

                  stipulate  that   when  its  person   feature  is  unified   with

                  person:3, the result is FAIL.
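
                        A rough Prolog approximation of this behavior is sketched

                  below for a two-feature structure. The layout fs(Person,Number)

                  and the names soft_unify/3 and soft_value/3 are hypothetical, and

                  the sketch handles only one level of structure; it is not a

                  description of how STUF is implemented.

```prolog
% soft_unify(+X, +Y, -Z)
% Unifies fs(Person,Number) structures feature by feature. A feature
% whose two values do not unify is marked 'FAIL' in Z. X and Y are
% left unchanged.
soft_unify(fs(P1, N1), fs(P2, N2), fs(P, N)) :-
    soft_value(P1, P2, P),
    soft_value(N1, N2, N).

% soft_value(+V1, +V2, -V)
soft_value(V1, V2, V) :-
    (   V1 \= V2
    ->  V = 'FAIL'
    ;   copy_term(V1-V2, V-V)   % unify fresh copies, not the originals
    ).
```

                  A verb could then be tested by soft-unifying its features with

                  fs(3,sing) and checking whether the person position of the result

                  is 'FAIL'.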


                        In STUF,  a feature can also  have a set of  alternatives as

                  its value,  and  when  two structures  containing  such  sets  are

                  unified, the unifier is the set of all the  unifiers of structures

                  that would result from choosing different alternatives.
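
                        For alternatives represented as Prolog lists, this operation

                  can be sketched with findall/3. The predicate name

                  unify_alternatives/3 is hypothetical, and representing a set as a

                  list is an assumption of the sketch.

```prolog
:- use_module(library(lists)).   % member/2

% unify_alternatives(+As, +Bs, -Unifiers)
% Unifiers lists the unifier of every pair (A, B), A from As and B
% from Bs, that can be unified. findall/3 copies each solution, so
% As and Bs are left unchanged.
unify_alternatives(As, Bs, Unifiers) :-
    findall(A,
            ( member(A, As), member(B, Bs), A = B ),
            Unifiers).
```

                  For example, unifying the alternatives [mas,fem] with [fem,neut]

                  leaves the single alternative fem; an empty result corresponds to

                  failure of the unification as a whole.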


                        Finally,  STUF  exploits the  fact  that  grammar rules  can

                  themselves be treated as feature structures. For example, the rule




                         S  -->         NP               VP

                                   [ person:X ]     [ person:X ]

                                   [ number:Y ]     [ number:Y ]


                  (or  more precisely the tree  structure that it  sanctions) can be

                  expressed as the feature structure


                         [ mother:      [ category:  s ]

                           daughter_1:  [ category:  np

                                          person:    X

                                          number:    Y ]

                           daughter_2:  [ category:  vp

                                          person:    X

                                          number:    Y ] ]

                                                               
                        STUF   therefore   implements  grammar   rules   via  "graph

                  application,"  an  operation  that treats  one  feature  structure

                  (directed acyclic graph) as  a function to be applied  to another.

                  Graph  application is  an operation  with four   arguments:  (1) a

                  graph expressing the  function; (2) a graph  to be treated as  the

                  argument; (3) a path indicating what part of the argument graph is

                  to be unified  with the function graph; and  (4) a path indicating

                  what part of the argument  graph should be unified (destructively)

                  with  the  result  of  the first  unification.  The  ordinary GULP

                  practice  of simply unifying one  graph with another  is a special

                  case of this.


                  6. Future prospects


                  6.1. Possible improvements


                        One  disadvantage of  GULP is  that every  feature structure

                  must contain a  position for  every feature in  the grammar.  This

                  makes feature  structures larger and  slower to process  than they

                  need  be.   By  design,   unused  features   often  fall   in  the

                  uninstantiated tail of the  value list, and hence take  up neither

                  time  nor  space.  But not  all  unused  features  have this  good

                  fortune. In practice, almost every value list contains gaps, i.e.,

                  positions that will never be instantiated, but must be passed over

                  in every unification.
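
                        The effect can be seen in a simplified model of value

                  lists. The sketch below is not GULP's actual internal format: the

                  declaration features/1 and the predicates value_list/2 and

                  put_at/3 are hypothetical, and a structure is represented simply

                  as an open-tailed list with one position per declared feature.

```prolog
:- use_module(library(lists)).   % nth1/3

% Hypothetical feature declaration; a feature's place in this list
% fixes its position in every value list.
features([person, number, tense, subcat]).

% value_list(+Pairs, ?List)
% List is an open-tailed list holding each Value at its Feature's position.
value_list([], _).
value_list([Feature-Value|Pairs], List) :-
    features(Fs),
    nth1(Pos, Fs, Feature), !,
    put_at(Pos, Value, List),
    value_list(Pairs, List).

% put_at(+N, +Value, ?List): the Nth element of List is Value.
put_at(1, V, [V|_]) :- !.
put_at(N, V, [_|T]) :-
    N > 1,
    N1 is N - 1,
    put_at(N1, V, T).
```

                  A structure that mentions only person is a one-element list with

                  an open tail, so nothing is wasted. A structure that mentions

                  only subcat must spell out three uninstantiated positions before

                  reaching it; these are the gaps referred to above.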




                        To  reduce the  number of  gaps, GULP  could be  modified to

                  distinguish different types of  value lists. The feature structure

                  for a  verb needs a feature for tense; the feature structure for a

                  noun  does not. Value lists  of different types  would reserve the

                  same  positions for  different  features,  skipping features  that

                  would never  be used. Some kind  of type marker, such  as a unique

                  functor,  would be needed so  that value lists  of different types

                  would not unify with each other.


                        Types of  feature structures  could be distinguished  by the

                  programmer   --   e.g.,   by   giving   alternative   g_features

                  declarations -- or by modifying the GULP translator itself to look

                  for patterns in the use of features.


                  6.2. Keyword parameters via GULP


                        Unification-based  grammar is  not  the only  use for  GULP.

                  Feature  structures  are  a good  formalization  of  keyword-value

                  argument lists.


                        Imagine   a  complicated   graphics  procedure   that  takes

                  arguments  indicating  desired window  size,  maximum  and minimum

                  coordinates, and  colors,  all of  which have  default values.  In

                  Pascal,  the procedure can only be called with explicit values for

                  all the parameters:


                  OpenGraphics(480,640,-240,240,-320,320,green,black);


                  There  could,  however, be  a convention  that  0 means  "take the

                  default":


                  OpenGraphics(0,0,0,0,0,0,red,blue);


                  Prolog can  do slightly  better by using  uninstantiated arguments

                  where  defaults are  wanted, and thereby  distinguishing "default"

                  from "zero":


                  :- open_graphics(_,_,_,_,_,_,red,blue).


                  In  GULP,  however, the  argument  of  open_graphics  can  be a

                  feature structure in which the  programmer mentions only the  non-

                  default items:


                  :- open_graphics( foreground:red..background:blue ).


                  In  this  feature   structure,  the  values  for   x_resolution,

                  y_resolution,   x_maximum,   x_minimum,   y_maximum,   and

                  y_minimum (or whatever they  are called) are left uninstantiated

                  because they are not mentioned. So in addition to facilitating the

                  implementation of unification-based grammars, GULP provides Prolog

                  with a keyword argument system.
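
                        On the receiving side, the procedure has to supply a default

                  for every value the caller left uninstantiated. A minimal sketch

                  in plain Prolog follows. The layout opts/4, the helper default/2,

                  and the default values themselves are assumptions made for

                  illustration; in GULP the argument would be written as a feature

                  structure instead.

```prolog
% opts(XResolution, YResolution, Foreground, Background):
% hypothetical fixed layout standing in for a GULP feature structure.
open_graphics(opts(XRes, YRes, Fg, Bg)) :-
    default(XRes, 640),
    default(YRes, 480),
    default(Fg, green),
    default(Bg, black).

% default(?Var, +Default)
% Binds Var to Default only if the caller left it uninstantiated.
default(Var, Default) :-
    (   var(Var)
    ->  Var = Default
    ;   true
    ).
```

                  A caller who specifies only the two colors thus receives the

                  default resolution, and explicitly supplied values are never

                  overwritten.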


                  References


                  Barton, G.  Edward; Berwick,  Robert  C.; and  Ristad, Eric  Sven.

                        1987.   Computational   complexity  and   natural  language.

                        Cambridge, Massachusetts: MIT Press.


                  Bouma, Gosse; König, Esther; and Uszkoreit, Hans. 1988. A flexible

                        graph-unification formalism and its application  to natural-

                        language processing. IBM Journal of Research and Development

                        32:170-184.


                  Bresnan, Joan, ed. 1982.  The mental representation of grammatical

                        relations.  Cambridge, Massachusetts: MIT Press.


                  Chomsky, Noam.  1957. Syntactic structures. (Janua  linguarum, 4.)

                        The Hague: Mouton.


                  Chomsky,  Noam. 1965. Aspects of  the theory of syntax. Cambridge,

                        Massachusetts: MIT Press.


                  Covington, Michael A. 1987.  GULP 1.1: an extension of  Prolog for

                        unification-based  grammar.  ACMC  Research Report  01-0021.

                        Advanced   Computational   Methods  Center,   University  of

                        Georgia.


                  Covington,  Michael A.;  Nute, Donald;  and Vellino,  André. 1988.

                        Prolog   programming  in   depth.  Glenview,   Ill.:  Scott,

                        Foresman.


                  Gazdar,  Gerald; Klein,  Ewan;  Pullum, Geoffrey;  and Sag,  Ivan.

                        1985.  Generalized  phrase  structure  grammar.  Cambridge,

                        Massachusetts: Harvard University Press.


                  Giannesini,  Francis;  Kanoui,  Henry;  Pasero,  Robert;  and  van

                        Caneghem, Michel. 1986. Prolog. Wokingham, England: Addison-

                        Wesley.


                  Johnson, Mark, and Klein, Ewan. 1985. A declarative formulation of

                        Discourse  Representation Theory.  Paper  presented  at  the

                        summer meeting  of the Association for  Symbolic Logic, July

                        15-20, 1985, Stanford University.


                  Johnson,  Mark, and  Klein, Ewan.  1986. Discourse,  anaphora, and

                        parsing.  Report No.  CSLI-86-63.  Center for  the Study  of

                        Language  and  Information,  Stanford  University.  Also  in

                        Proceedings of Coling86 669-675.




                  Joshi,  Aravind  K.  1986.  The  convergence  of  mildly  context-

                        sensitive grammar formalisms.  Draft distributed at Stanford

                        University, 1987.


                  Kamp, Hans. 1981.  A theory of truth  and semantic representation.

                        Reprinted  in  Groenendijk,  J.;  Janssen,  T.  M.  V.;  and

                        Stokhof, M., eds.,  Truth, interpretation, and  information.

                        Dordrecht: Foris, 1984.


                  Kaplan,  Ronald M.,  and Bresnan,  Joan.  1982. Lexical-Functional

                        Grammar:  a formal  system  for grammatical  representation.

                        Bresnan 1982:173-281.


                  Karttunen,  Lauri. 1986a.  D-PATR: a  development environment  for

                        unification-based  grammars.  Report No.  CSLI-86-61. Center

                        for  the  Study   of  Language  and  Information,   Stanford

                        University. Shortened version in Proceedings of Coling86 74-

                        80.


                  Karttunen, Lauri. 1986b. Features and values. Shieber et  al. 1986

                        (vol. 1), 17-36. Also in Proceedings of Coling84 28-33.


                  Matsumoto, Yuji; Tanaka, Hozumi; and  Kiyono, Masaki. 1986. BUP: a

                        bottom-up parsing system for  natural languages. Michel  van

                        Caneghem and  David Warren, eds., Logic  programming and its

                        applications 262-275. Norwood, N.J.: Ablex.


                  Pollard, Carl, and Sag, Ivan A. 1987. Information-based syntax and

                        semantics, vol. 1:  Fundamentals. (CSLI Lecture  Notes, 13.)

                        Center for  the Study of Language  and Information, Stanford

                        University.


                  Sedogbo, Celestin.  1986. AVAG:  an attribute/value  grammar tool.

                        FNS-Bericht   86-10.   Seminar   für   natürlich-sprachliche

                        Systeme, Universität Tübingen.


                  Shieber,  Stuart  M. 1986a.  An introduction  to unification-based

                        approaches to  grammar. (CSLI Lecture Notes,  4.) Center for

                        the Study of Language and Information, Stanford University.


                  Shieber,  Stuart M. 1986b. The  design of a  computer language for

                        linguistic information. Shieber et  al. (eds.) 1986 (vol. 1)

                        4-26.


                  Shieber, Stuart M.; Pereira, Fernando C. N.; Karttunen, Lauri; and

                        Kay, Martin,  eds. 1986. A  compilation of papers  on

                        unification-based grammar formalisms. 2 vols. bound as one.

                        Report No.

                        CSLI-86-48.   Center  for   the   Study  of   Language   and

                        Information, Stanford University.




                  Spencer-Smith,    Richard.    1987.   Semantics    and   discourse

                        representation. Mind and Language 2.1: 1-26.


                  Appendix. GULP 2.0 User's Guide


                  A.1  Overview


                        GULP is  a laboratory instrument, not  a commercial product.

                  Although  reasonably  easy  to  use,  it  lacks  the  panache  and

                  sophistication  of Turbo Pascal or Arity  Prolog 5.0. The emphasis

                  is on getting the job done as simply as possible.


                        The final word  on how GULP  works is contained in  the file

                  GULP.ARI  or GULP.PL, which you should consult whenever you have a

                  question that is not answered here.


                  A.2  Installation and access


                        On the VAX, GULP is  already installed, and you reach  it by

                  the command 


                  $ gulp


                  This  puts you into a conventional Prolog environment (note: not a

                  Quintus Prolog split-screen environment)  in which the GULP built-

                  in predicates are available.


                        The IBM PC version of GULP is supplied as a modified copy of

                  Arity Prolog Interpreter  4.0. It is for use  only on machines for

                  which Arity Prolog is licensed and is not for distribution outside

                  the AI Research Group. 


                        Many of  the GULP file names  are the same as  files used by

                  the unmodified Arity Prolog Interpreter. It is therefore important

                  that GULP be installed in a different directory.


                        To  run  GULP you  also need  a  full-screen editor  that is

                  accessible by the command:


                  edit filename


                  GULP passes commands of this form to DOS when invoking the editor.

                  We usually  use AHED.COM, renamed  EDIT.COM, for the  purpose, but

                  you can use any editor that produces ASCII files.


                  A.3  How to run programs




                        GULP  is  simply  a version  of  Prolog  with more  built-in

                  predicates  added. All the functions  and features of  Prolog (see note 7) are

                  still  available and  work  exactly as  before.  GULP is  used  in

                  exactly the same way as Prolog except that:


                        (1) Programs  containing feature structures  must be  loaded

                  via the built-in predicate load, not consult or reconsult. The

                  reason is  that consult  or reconsult  would  load the  feature

                  structures into  memory without converting them  into value lists.

                  Prolog will do  this without complaining,  but GULP programs  will

                  not work. Never  use consult or reconsult to load anything that

                  contains GULP feature structures.


                        (2)  You must always invoke the editor with the GULP command

                  ed, not with  whatever command you  would use  in Prolog. This  is

                  important    because   your   ordinary   editing   command   would

                  automatically invoke reconsult after  editing; ed invokes  load

                  instead.


                        (3) You  cannot use feature  structure notation  in a  query

                  because   queries  do not  go through  the translator.  Write your

                  program so  that  you  can invoke  all  the  necessary  predicates

                  without having to type feature structures on the command line.


                  A.4  Built-in predicates usually used as commands


                  ?- load filename.


                  Loads  the program  on  file filename  into  memory via  the  GULP

                  translator.  load  is   like  reconsult  in  that,  whenever  a

                  predicate  is  encountered  that   is  already  defined,  the  old

                  definition is discarded before the new definition  is loaded. This

                  means that  all clauses defining a  predicate must be on  the same

                  file,  but  they need  not be  contiguous.  You can  also load

                  definitions of different  predicates from different  files without

                  conflict, and you can embed a call to load in a program to

                  make it load another program.


                  If the file  name is not in  quotes, the ending .GLP  is added. If

                  the  file name  contains  a period,  it  must be  typed in  quotes

                  (single or double).


                  Note 7. Except the module system, which uses the colon (':') for

                  its own purposes and conflicts with GULP syntax.




                  ?- load.


                  Loads (again)  the file that was  used in the most  recent call to

                  load or ed.


                  ?- ed filename.


                  Calls the editor to process file filename, then loads that file.


                  ?- ed.


                  Edits and loads (again) the file  that was used in the most recent

                  call to load or ed.


                  ?- list P/N.


                  Lists  all clauses that define  the predicate P  with N arguments,

                  provided these clauses  were loaded with  ed or  load. (Note:  In

                  the  case of grammar rules,  the number of  arguments includes the

                  arguments  automatically  supplied  by  the  Prolog  grammar  rule

                  translator.)


                  ?- list P.


                  Lists all clauses that  define the predicate P with any  number of

                  arguments, provided these clauses were loaded with ed or load. 


                  ?- list.


                  Lists all clauses that were loaded with ed or load.


                  ?- new.


                  Clears the workspace;  removes from memory  all clauses that  were

                  loaded with  ed  or load.  (Does  not  delete clauses  that  were

                  placed into memory by consult, reconsult, or assert.)


                  A.5  Built-in predicates usually used within the program




                  g_translate(FeatureStructure,ValueList)


                  Translates a feature structure  into a value list, or  vice versa.

                  Used   when   you   must   interconvert   internal  and   external

                  representations at run time  (e.g., to input or output  them). For

                  example,  the following will  accept a  feature structure  in GULP

                  notation  from the keyboard, translate  it into a  value list, and

                  pass the value list to your predicate test:


                              ?- read(X), g_translate(X,Y), test(Y).


                  The  following translates a feature  structure X  from internal to

                  external representation and prints it out:


                              ... g_translate(Y,X), write(Y).
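
                  The idea behind a value list can be sketched in plain Prolog.
                  This is not GULP's actual internal representation. It is a
                  minimal illustration, assuming a hypothetical fixed schema of
                  three features and using Feature-Value pairs in place of GULP
                  notation. The names schema/1, to_value_list/2, fill_slots/3,
                  lookup/3, and the wrapper vl/1 are invented for this sketch.
                  Each feature gets a fixed slot and unmentioned features become
                  fresh variables, so ordinary unification of two such terms
                  behaves like unification of the feature structures.

```prolog
% Hypothetical fixed schema: the features every value list has a slot for.
schema([num, pers, tense]).

% to_value_list(+Pairs, -ValueList)
% Pairs is a list of Feature-Value terms standing in for a feature structure.
% Features that are not in the schema are silently ignored in this sketch.
to_value_list(Pairs, vl(Slots)) :-
    schema(Features),
    fill_slots(Features, Pairs, Slots).

% One slot per schema feature; a feature not mentioned in Pairs
% is left as a fresh (uninstantiated) variable.
fill_slots([], _, []).
fill_slots([F|Fs], Pairs, [Slot|Slots]) :-
    (   lookup(F, Pairs, Value)
    ->  Slot = Value
    ;   true
    ),
    fill_slots(Fs, Pairs, Slots).

% lookup(+Feature, +Pairs, -Value): first value given for Feature, if any.
lookup(F, [F0-V|_], V) :- F == F0, !.
lookup(F, [_|Rest], V) :- lookup(F, Rest, V).
```

                  With these definitions, the value lists built from
                  [num-singular] and [pers-third] unify with each other, while
                  those built from [num-singular] and [num-plural] do not.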


                  display_feature_structure(X)


                  Displays  X in a convenient  indented notation  (not GULP syntax),

                  where X is either a feature structure or a value list.


                  g_display(X)


                  Equivalent  to  display_feature_structure(X);  retained  for

                  compatibility with GULP 1.


                  g_printlength(A,N)


                  Where A is  an atom, instantiates  N to the  number of  characters

                  needed  to  print it.  Useful  in constructing  your  own indented

                  output routines.
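
                  As a rough sketch of the kind of indented output routine
                  meant here, the following pads an atom to a fixed column
                  width. It uses the standard predicate atom_length/2 as a
                  stand-in, on the assumption that it gives the same count as
                  g_printlength for atoms printed with write/1. The names
                  pad_count/3, write_padded/2, and write_spaces/1 are invented
                  for the example.

```prolog
% pad_count(+Atom, +Width, -Pad): spaces needed to fill Atom out to Width columns.
pad_count(Atom, Width, Pad) :-
    atom_length(Atom, Len),          % assumed stand-in for g_printlength/2
    Pad is max(0, Width - Len).

% write_padded(+Atom, +Width): write Atom, then pad with spaces to Width columns.
write_padded(Atom, Width) :-
    pad_count(Atom, Width, Pad),
    write(Atom),
    write_spaces(Pad).

% write_spaces(+N): write N space characters.
write_spaces(0) :- !.
write_spaces(N) :-
    N > 0,
    write(' '),
    N1 is N - 1,
    write_spaces(N1).
```

                  A call such as write_padded(num, 8) writes num followed by
                  five spaces, so that the next item starts in a fixed column.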


                  writeln(X)


                  If  X is a list, writes  each element of  X on a separate line and

                  then  begins a new  line. If X  is not a  list, writes  X and then

                  begins a new line. Examples:


                        writeln('This is a message.').


                        writeln(['This is a','two-line message.']).


                  Lists  within lists are not processed recursively. The elements of

                  the outermost list are printed one  per line, and lists within  it

                  are printed as lists.
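
                  The behavior described above can be approximated in standard
                  Prolog as follows. This is a sketch, not the definition in
                  GULP.ARI or GULP.PL. The name writeln_sketch/1 is invented
                  (many Prologs already define their own writeln/1), and the
                  treatment of the empty list, which prints nothing here, is an
                  assumption.

```prolog
% writeln_sketch(+X)
% A list is written one element per line; anything else is written
% followed by a newline. Inner lists are written as lists, not expanded.
writeln_sketch(X) :-
    var(X), !,                 % an unbound variable is not treated as a list
    write(X), nl.
writeln_sketch([]) :- !.       % assumption: the empty list prints nothing
writeln_sketch([Head|Tail]) :- !,
    write(Head), nl,
    writeln_sketch(Tail).
writeln_sketch(X) :-
    write(X), nl.
```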




                  append(List1,List2,List3)


                  Concatenates  List1 and  List2 giving  List3, or  splits List3

                  into List1 and List2.
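
                  Both directions of use can be seen with the textbook
                  definition of append, given here under the invented name
                  append_sketch/3 so that it does not clash with the built-in.
                  The helper all_splits/2 is also invented. It collects, on
                  backtracking, every way of splitting a list in two.

```prolog
% append_sketch(?List1, ?List2, ?List3): List3 is List1 followed by List2.
append_sketch([], List, List).
append_sketch([Head|Tail], List, [Head|Rest]) :-
    append_sketch(Tail, List, Rest).

% all_splits(+List, -Splits): every way of cutting List into a front and a back.
all_splits(List, Splits) :-
    findall(Front-Back, append_sketch(Front, Back, List), Splits).
```

                  For example, append_sketch([a,b],[c],L) gives L = [a,b,c],
                  and all_splits([a,b],S) gives S = [[]-[a,b],[a]-[b],[a,b]-[]].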


                  member(Element,List)


                  Succeeds if  Element is  an  element of  List. If  Element  is

                  uninstantiated,  it will  be instantiated,  upon  backtracking, to

                  each successive element of List.


                  remove_duplicates(List1,List2)


                  Removes duplicate elements from List1 giving List2.


                  retractall(P)


                  Retracts (abolishes) all clauses whose predicate is P.


                  phrase(Constituent,InputString)


                  Provides a simplified way to call a parser written with DCG rules;

                  for  example,  the   goal  ?- phrase(s,[the,dog,barks])   is

                  equivalent to ?- s([the,dog,barks],[]).
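
                  The equivalence can be checked with a small DCG. The grammar
                  and the predicate parses_both_ways/1 are invented for this
                  illustration, and only standard DCG translation is assumed.

```prolog
% A three-rule DCG for "the dog barks".
s  --> np, vp.
np --> [the], [dog].
vp --> [barks].

% parses_both_ways(+Words): succeeds only if both calling styles accept Words.
parses_both_ways(Words) :-
    phrase(s, Words),        % simplified call
    s(Words, []).            % direct call with the two arguments added by the DCG translator
```

                  The goal parses_both_ways([the,dog,barks]) succeeds, and
                  parses_both_ways([the,dog]) fails.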


                  copy(X,Y)


                  Copies term  X, giving  term Y. These  terms are  the same  except

                  that  all  uninstantiated  variables  in  X  are  replaced by  new

                  uninstantiated  variables  in   Y,  arranged  in  the   same  way.

                  Variables in Y can then be instantiated without affecting X.
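
                  The following sketch shows the effect, using the standard
                  predicate copy_term/2 on the assumption that it behaves like
                  copy. The predicate copy_demo/2 is invented for the
                  illustration.

```prolog
% copy_demo(-Original, -Copy)
copy_demo(Original, Copy) :-
    Original = f(A, B, A),          % A occurs twice, B once
    copy_term(Original, Copy),      % assumed counterpart of copy/2
    Copy = f(1, 2, Third),          % instantiate the copy only
    Third == 1,                     % the repeated variable is still shared in the copy
    var(A), var(B).                 % the original is untouched
```

                  After the call, Copy is f(1,2,1), while Original is still
                  f(A,B,A) with both variables uninstantiated.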


                  call_if_possible(Goal) 


                  Executes Goal, or fails without an error message  if there are no

                  clauses  for Goal. (In Quintus Prolog, the program crashes with an

                  error message if  there is an  attempt to query  a goal for  which

                  there are no clauses.)
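
                  In a Prolog that reports undefined predicates by raising a
                  standard existence error, similar behavior can be sketched
                  with catch/3. This is not GULP's own definition, and the name
                  call_if_possible_sketch/1 is invented. Note one imprecision:
                  the sketch also fails quietly when a predicate called
                  indirectly by Goal is undefined.

```prolog
% call_if_possible_sketch(+Goal)
% Runs Goal; if an undefined predicate is reached, fails instead of
% letting the existence error propagate.
call_if_possible_sketch(Goal) :-
    catch(call(Goal),
          error(existence_error(procedure, _), _),
          fail).
```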


                  g_fs(X)


                  Succeeds if X is an  untranslated feature structure, i.e.,  a term

                  whose principal functor is ':', '..', or '::'.




                  g_not_fs(X)


                  Succeeds if X is not an untranslated feature structure.


                  g_vl(X)


                  Succeeds if  X is a  value list (the internal  representation of a

                  feature structure).


                  A.6  Other built-in predicates


                  g_ed_command(X)


                  Instantiates X to the command  presently used to call  the editor.

                  To  call  a different  editor, assertz  your  own clause  for this

                  predicate (e.g., 'g_ed_command(emacs)').


                  g_herald(X)


                  Instantiates  X to  an  atom identifying  the  current version  of

                  GULP.


                  A.7  Differences between GULP 1.1 and 2.0


                  (1)  g_features declarations  are  no longer  required, but  are

                  still permitted, and, if used, need not be complete.


                  (2) The operator '..' is now preferred in place of '::'. However,

                  the older form can still be used.


                  (3)  There have been minor  changes in the  operator precedence of

                  ':'  and '..'  relative  to  other operators.  This  is extremely

                  unlikely  to  cause  problems  unless  you  have  written  feature

                  structures that contain other operators such as '+' or '-'. 


                  (4) GULP 2.0 distinguishes between features that are mentioned but

                  uninstantiated, and features that are never mentioned. Previously,

                  g_display never printed out any uninstantiated features. 


                  (5)  Bugs  have  been corrected.  Translation  of  value  lists to

                  feature structures now works correctly.


                  (6) Some rarely used built-in predicates have been deleted. In all

                  cases these  predicates had more  common synonyms  (ed rather than

                  g_ed, list rather than g_list, etc.).




                  (7) list translates feature  structures into GULP notation before

                  displaying them. (A debugger with the same capability is planned.)


                  (8) Nested  loads are now supported. That is, a file being loaded

                  can contain a directive such  as ':- load file2.' which will be

                  executed correctly.




          Figure 1. A syntactic tree (based on Chomsky 1957).


                      S


               NP            VP


           D        N         V


          The      dog      barks.


          Figure 2. The same tree with features added.


                                      S


                     NP                               VP

              [ num:singular ]                 [ num:singular  ]
                                               [ pers:3rd      ]
                                               [ tense:present ]


           D                    N                      V

                         [ num:singular ]      [ num:singular  ]
                                               [ pers:3rd      ]
                                               [ tense:present ]


          The                  dog                   barks.




          Figure 3. An example of a unification-based grammar.


          [a]           S           -->      NP               VP

               [ sem:[ pred:X ] ]     [ sem:Y    ]  [ sem:[ pred:X ] ]
               [     [ arg1:Y ] ]     [ case:nom ]  [     [ arg2:Z ] ]
               [     [ arg2:Z ] ]


          [b]            VP           -->      V             NP

               [ sem:[ pred:X1 ] ]       [ sem:X1 ]   [ sem:Y1   ]
               [     [ arg2:Y1 ] ]                    [ case:acc ]


          [c]       V       -->    sees

                [sem:SEES]


          [d]         NP      -->   Max

                [sem:MAX]


          [e]         NP      -->   Bill

                [sem:BILL]


          [f]       NP       -->   me

               [ sem:ME   ]
               [ case:acc ]




          Figure 4. Bottom-up parsing of Max sees Bill.


          a. Rules [c], [d], and [e] supply features for individual words:


                     NP               V                  NP

                 [sem:MAX]        [sem:SEES]         [sem:BILL]

                     |                |                  |

                    Max              sees               Bill


          b. Rule [b] allows V and NP to be grouped into a VP:


                                              VP

                                     [ sem:[ pred:SEES ] ]
                                     [     [ arg2:BILL ] ]

                                               |
                                      +--------+---------+
                                      |                  |
                     NP               V                  NP

                [ sem:MAX ]      [ sem:SEES ]       [ sem:BILL ]
                                                    [ case:acc ]

                     |                |                  |
                    Max              sees               Bill




          c. Rule [a] allows NP and VP to be grouped into an S:


                                    S

                          [ sem:[ pred:SEES ] ]
                          [     [ arg1:MAX  ] ]
                          [     [ arg2:BILL ] ]

                                    |
                     +--------------+----------+
                     |                         |
                     |                         VP
                     |
                     |               [ sem:[ pred:SEES ] ]
                     |               [     [ arg2:BILL ] ]
                     |
                     |                         |
                     |                +--------+---------+
                     |                |                  |
                     NP               V                  NP

                [ sem:MAX  ]     [ sem:SEES ]       [ sem:BILL ]
                [ case:nom ]                        [ case:acc ]

                     |                |                  |
                    Max              sees               Bill




          Figure 5. DAG representations of feature structures.


            [ a:b ]                      .
            [ c:b ]      =             / | \
            [ e:d ]                   a  c  e
                                       \ /   \
                                        b     d


          [ syn:[ case:acc    ] ]                    .
          [     [ gender:masc ] ]    =             /   \
          [ sem:MAN             ]               syn     sem
                                                 |        \
                                                 .         MAN
                                               /   \
                                           case     gender
                                             |        |
                                            acc      masc




          Figure  6. Parse tree for Max sees me. The ungrammatical sentence

          Me sees Max is ruled out by a feature conflict.


                                     S

                           [ sem:[ pred:SEES ] ]
                           [     [ arg1:MAX  ] ]
                           [     [ arg2:ME   ] ]

                                     |
                      +--------------+----------+
                      |                         |
                      |                         VP
                      |
                      |               [ sem:[ pred:SEES ] ]
                      |               [     [ arg2:ME   ] ]
                      |
                      |                         |
                      |                +--------+---------+
                      |                |                  |
                      NP               V                  NP

                 [ sem:MAX  ]     [ sem:SEES ]       [ sem:ME   ]
                 [ case:nom ]                        [ case:acc ]

                      |                |                  |
                     Max              sees                me




          Figure 7.


            % GULP example 1.

            % Grammar from Figure 3, in DCG notation, with GULP feature structures.


            s(sem: (pred:X .. arg1:Y .. arg2:Z)) -->  np(sem:Y .. case:nom),

                                                      vp(sem: (pred:X .. arg2:Z)).


            vp(sem: (pred:X1 .. arg2:Y1)) -->  v(sem:X1),

                                               np(sem:Y1 .. case:acc).


            v(sem:'SEES')  --> [sees].


            np(sem:'MAX')  --> [max].


            np(sem:'BILL') --> [bill].


            np(sem:'ME' .. case:acc) --> [me].


            % Procedure to parse a sentence and display its features


            try(String) :- writeln([String]),

                           phrase(s(Features),String),

                           display_feature_structure(Features).


            % Example sentences


            test1 :- try([max,sees,bill]).

            test2 :- try([max,sees,me]).

            test3 :- try([me,sees,max]).  /* should fail */
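
            For comparison, the grammar of Figure 3 can be written by hand in
            plain Prolog, without GULP, by giving each feature a fixed argument
            position. This is only a sketch of what the GULP translator saves
            the programmer from doing; it is not the translator's actual
            output. The term shapes fs(Sem,Case) and sem(Pred,Arg1,Arg2) and
            the predicate parse/2 are invented for the illustration.

```prolog
% fs(Sem, Case): one argument position per feature.
% sem(Pred, Arg1, Arg2): the nested value of the sem feature on S and VP.
% An unspecified feature is simply an anonymous variable.

s(fs(sem(Pred, Arg1, Arg2), _)) -->
    np(fs(Arg1, nom)),
    vp(fs(sem(Pred, _, Arg2), _)).

vp(fs(sem(Pred, _, Arg2), _)) -->
    v(fs(Pred, _)),
    np(fs(Arg2, acc)).

v(fs('SEES', _))  --> [sees].
np(fs('MAX', _))  --> [max].
np(fs('BILL', _)) --> [bill].
np(fs('ME', acc)) --> [me].

% parse(+Words, -Features): parse Words as a sentence.
parse(Words, Features) :-
    phrase(s(Features), Words).
```

            Here parse([max,sees,bill],F) binds the sem position of F to
            sem('SEES','MAX','BILL'), and parse([me,sees,max],F) fails because
            acc does not unify with nom. Adding a feature in this style means
            editing every term by hand, which is the bookkeeping that feature
            structure notation avoids.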




          Figure 8.


            % Same as GULP example 1, but written in a much more PATR-like style,

            % treating the unifications as separate operations.


            s(Sfeatures) --> np(NPfeatures), vp(VPfeatures),

                             { Sfeatures = sem: (pred:X .. arg1:Y .. arg2:Z),

                               NPfeatures = sem:Y .. case:nom,

                               VPfeatures = sem: (pred:X .. arg2:Z) }.


            vp(VPfeatures) -->  v(Vfeatures), np(NPfeatures),

                                { VPfeatures = sem: (pred:X1 .. arg2:Y1),

                                  Vfeatures  = sem:X1,

                                  NPfeatures = sem:Y1 .. case:acc }.


            v(Features)   --> [sees], { Features = sem:'SEES' }.


            np(Features)  --> [max],  { Features = sem:'MAX' }.


            np(Features)  --> [bill], { Features = sem:'BILL' }.


            np(Features)  --> [me],   { Features = sem:'ME' .. case:acc }.


            % Procedure to parse a sentence and display its features


            try(String) :- writeln([String]),

                           s(Features,String,[]),

                           display_feature_structure(Features).


            % Example sentences


            test1 :- try([max,sees,bill]).

            test2 :- try([max,sees,me]).

            test3 :- try([me,sees,max]).  /* should fail */




          Figure 9.


            % Demonstration of a hold stack that

            % picks up the word 'what' at beginning of

            % sentence, and carries it along until an

            % empty NP position is found


            % S may or may not begin with 'what did'.

            % If it does, 'what' is added to the stack

            % before the NP and VP are parsed.


            s(S) --> np(NP), vp(VP), 

                           { S  = hold: (in:H1..out:H3),

                             NP = hold: (in:H1..out:H2),

                             VP = hold: (in:H2..out:H3) }.


            s(S) --> [what,did], np(NP), vp(VP), 

                           { S  = hold: (in:H1..out:H3),

                             NP = hold: (in:[what|H1]..out:H2),

                             VP = hold: (in:H2..out:H3) }.


            % NP is parsed by either accepting det and n,

            % leaving the hold stack unchanged, or else

            % by extracting 'what' from the stack without

            % accepting anything from the input string.


            np(NP) --> det, n, { NP = hold: (in:H..out:H) }.


            np(NP) --> [], { NP = hold: (in:[what|H1]..out:H1) }.


            % VP consists of V followed by NP or S.

            % Both hold:in and hold:out are the same

            % on the VP as on the S or NP, since the

            % hold stack can only be altered while

            % processing the S or NP, not the verb.


            vp(VP) --> v, np(NP), { VP = hold:H,

                                    NP = hold:H }.


            vp(VP) --> v, s(S), { VP = hold:H,

                                  S  = hold:H }.


            % Lexicon


            det --> [the];[a];[an].

            n   --> [dog];[cat];[boy].

            v   --> [said];[say];[chase];[chased].


            try(X) :- writeln([X]), 

                      S = hold: (in:[]..out:[]), 




                      phrase(s(S),X,[]).


            test1 :- try([the,boy,said,the,dog,chased,the,cat]).

            test2 :- try([what,did,the,boy,say,chased,the,cat]).

            test3 :- try([what,did,the,boy,say,the,cat,chased]).

            test4 :- try([what,did,the,boy,say,the,dog,chased,the,cat]). 

                        /* test4 should fail */




          Figure 10.


            % Discourse Representation Theory

            % (part of the program from Johnson & Klein 1986,

            % translated from PrAtt into GULP).


            % unique_integer(N) 

            %          instantiates N to a different integer

            %          every time it is called, thereby generating

            %          unique indices.


            unique_integer(N) :- 

                        retract(unique_aux(N)),

                        !,

                        NewN is N+1,

                        asserta(unique_aux(NewN)).


            unique_aux(0).


            % Nouns

            %      Each noun generates a unique index and inserts

            %      it, along with a condition, into the DRS that

            %      is passed to it.


            n(N) --> [man],

                          { unique_integer(C),

                            N = syn:index:C ..

                            sem: (in:  [Current|Super] ..

                                  out: [[C,man(C)|Current]|Super]) }.


            n(N) --> [donkey],

                          { unique_integer(C),

                            N = syn:index:C ..

                            sem: (in:  [Current|Super] ..

                                  out: [[C,donkey(C)|Current]|Super]) }.


            % Verbs

            %      Each verb is linked to the indices of its arguments

            %      through syntactic features. Using these indices,

            %      it adds the appropriate predicate to the semantics.


            v(V) --> [saw],

                      { V = syn: (arg1:Arg1 .. arg2:Arg2) ..

                            sem: (in:  [Current|Super] ..

                                  out: [[saw(Arg1,Arg2)|Current]|Super]) }.




            % Determiners

            %      Determiners tie together the semantics of their

            %      scope and restrictor. The simplest determiner,

            %      'a', simply passes semantic material to its

            %      restrictor and then to its scope. A more complex

            %      determiner such as 'every' passes an empty list

            %      to its scope and restrictor, collects whatever

            %      semantic material they add, and then arranges

            %      it into an if-then structure.


            det(Det) --> [a],

                      { Det = sem:res:in:A,   Det = sem:in:A,

                        Det = sem:scope:in:B, Det = sem:res:out:B,

                        Det = sem:out:C,      Det = sem:scope:out:C }.


            det(Det) --> [every],

                      { Det = sem:res:in:[[]|A],    Det = sem:in:A,

                        Det = sem:scope:in:[[]|B],  Det = sem:res:out:B,

                        Det = sem:scope:out:[Scope,Res|[Current|Super]],

                        Det = sem:out:[[Res>Scope|Current]|Super] }.


            % Noun phrase

            %      Pass semantic material to the determiner, which

            %      will specify the logical structure.


            np(NP) --> { NP=sem:A,      Det=sem:A,

                         Det=sem:res:B, N=sem:B,

                         NP=syn:C,      N=syn:C },  det(Det),n(N).


            % Verb phrase

            %      Pass semantic material to the embedded NP

            %      (the direct object).


            vp(VP) --> { VP = sem:A,          NP = sem:A,

                         NP = sem:scope:B,    V = sem:B,

                         VP = syn:arg2:C,     NP = syn:index:C,

                         VP = syn:D,          V = syn:D },    v(V), np(NP).


            % Sentence

            %      Pass semantic material to the subject NP.

            %      Pass VP semantics to the subject NP as its scope.


            s(S) --> { S = sem:A,         NP = sem:A,

                       S = syn:B,         VP = syn:B,

                       NP = sem:scope:C,  VP = sem:C,

                       VP = syn:arg1:D,   NP = syn:index:D }, np(NP), vp(VP).


            % Procedure to parse and display a sentence




            try(String) :-  write(String),nl,

                            Features = sem:in:[[]],   /* start w. empty structure */

                            phrase(s(Features),String),

                            Features = sem:out:SemOut,  /* extract what was built */

                            display_feature_structure(SemOut).


            % Example sentences


            test1 :- try([a,man,saw,a,donkey]).

            test2 :- try([a,donkey,saw,a,man]).

            test3 :- try([every,man,saw,a,donkey]).

            test4 :- try([every,man,saw,every,donkey]).




          Figure 11.


            % BUP in GULP:

            % Bottom-up parsing algorithm of Matsumoto et al. (1986)

            % with the grammar from Figure 3.


            % Goal-forming clause


            goal(G,Gf,S1,S3) :-

                    word(W,Wf,S1,S2),

                    NewGoal =.. [W,G,Wf,Gf,S2,S3],

                    call(NewGoal).


            % Terminal clauses for nonterminal symbols


            s(s,F,F,X,X).

            vp(vp,F,F,X,X).

            np(np,F,F,X,X).


            % Phrase-structure rules


            % np vp --> s


            np(G,NPf,Gf,S1,S3) :-  goal(vp,VPf,S1,S2),

                                   s(G,Sf,Gf,S2,S3),

                                   NPf = sem:Y..case:nom,

                                   VPf = sem: (pred:X..arg2:Z),

                                   Sf  = sem: (pred:X..arg1:Y..arg2:Z).


            % v np --> vp


            v(G,Vf,Gf,S1,S3)  :- goal(np,NPf,S1,S2),

                                 vp(G,VPf,Gf,S2,S3),

                                 Vf  = sem:X1,

                                 NPf = sem:Y1..case:acc,

                                 VPf = sem: (pred:X1..arg2:Y1).


            % Terminal symbols


            word(v,sem:'SEES',[sees|X],X).

            word(np,sem:'MAX',[max|X],X).

            word(np,sem:'BILL',[bill|X],X).

            word(np,sem:'ME'..case:acc,[me|X],X).


            % Procedure to parse a sentence and display its features




            try(String) :- writeln([String]),

                           goal(s,Features,String,[]),

                           display_feature_structure(Features).


            % Example sentences


            test1 :- try([max,sees,bill]).

            test2 :- try([max,sees,me]).

            test3 :- try([me,sees,max]).  /* should fail */