Head-driven phrase structure grammar

Head-driven phrase structure grammar (HPSG) is a highly lexicalized, non-derivational generative grammar theory developed by Carl Pollard and Ivan Sag (1985). It is the immediate successor to generalized phrase structure grammar. HPSG draws on other fields, such as computer science (data type theory and knowledge representation), and uses Ferdinand de Saussure's notion of the sign. It uses a uniform formalism and is organized in a modular way, which makes it attractive for natural language processing.

An HPSG grammar includes principles and grammar rules, as well as lexicon entries, which are normally not considered to belong to a grammar. The formalism is based on lexicalism. This means that the lexicon is more than just a list of entries; it is in itself richly structured. Individual entries are marked with types, and the types form a hierarchy.

The basic type HPSG deals with is the sign. Words and phrases are two different subtypes of sign. A word has two features: [PHON] (the sound, the phonetic form) and [SYNSEM] (the syntactic and semantic information), both of which are split into subfeatures. Signs and rules are formalized as typed feature structures.
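
The relationship between sign, word, and phrase can be sketched as a small type hierarchy. The following Python fragment is only an illustration: the table `SUPERTYPE` and the helper `is_subtype` are hypothetical names, and real HPSG type hierarchies are far larger and allow multiple inheritance.

```python
# Hypothetical, minimal type hierarchy: each type maps to its immediate supertype.
SUPERTYPE = {
    "sign": None,      # the most general type
    "word": "sign",
    "phrase": "sign",
}

def is_subtype(t, ancestor):
    """Return True if type t equals ancestor or inherits from it."""
    while t is not None:
        if t == ancestor:
            return True
        t = SUPERTYPE[t]
    return False

# A word-level sign with the two top-level features named in the text.
# The subfeatures of SYNSEM are left empty in this sketch.
walks = {"TYPE": "word", "PHON": ["walks"], "SYNSEM": {}}

print(is_subtype(walks["TYPE"], "sign"))
print(is_subtype("phrase", "word"))
```

Under these assumptions, every word and every phrase counts as a sign, while word and phrase remain distinct from each other.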

A sample grammar
HPSG generates strings by combining signs, which are defined by their location within a type hierarchy and by their internal feature structure, represented by attribute value matrices (AVMs). Features take types or lists of types as their values, and these values may in turn have their own feature structure. Grammatical rules are largely expressed through the constraints signs place on one another. A sign’s feature structure describes its phonological, syntactic, and semantic properties. In common notation, AVMs are written with features in upper case and types in italicized lower case. Numbered indices in an AVM mark token-identical values: two features carrying the same index share one and the same value, rather than two values that merely look alike.
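
The difference between token identity and mere equality of values can be illustrated with nested Python dictionaries standing in for AVMs. This is a rough sketch, not an HPSG implementation; the feature names `SUBJ_INDEX`, `WALKER`, and `GEND` are chosen for illustration only.

```python
# One value object reached through two features: token identity,
# which an AVM would mark with the same numbered index in both places.
index = {"PER": "3rd", "NUM": "sg"}
shared = {"SUBJ_INDEX": index, "WALKER": index}

# Two separate value objects that merely look alike: no token identity.
copied = {
    "SUBJ_INDEX": {"PER": "3rd", "NUM": "sg"},
    "WALKER": {"PER": "3rd", "NUM": "sg"},
}

# Adding information to a shared value is visible through every path to it.
shared["SUBJ_INDEX"]["GEND"] = "fem"
copied["SUBJ_INDEX"]["GEND"] = "fem"

print(shared["WALKER"])   # also carries GEND, because the value is shared
print(copied["WALKER"])   # unchanged, because it is a separate object
```

In the shared case, any constraint placed on one path automatically holds for the other, which is the effect numbered indices are meant to capture.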

In the simplified AVM for the word “walks” below, the verb’s categorial information is divided into features that describe it (HEAD) and features that describe its arguments (VALENCE).

“Walks” is a sign of type word with a head of type verb. As an intransitive verb, “walks” has no complement but requires a subject that is a third person singular noun. The semantic value of the subject (CONTENT) is co-indexed with the verb’s only argument (the individual doing the walking). The following AVM for “she” represents a sign with a SYNSEM value that could fulfill those requirements.
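
The way a candidate subject is checked against the verb’s requirements can be sketched with a simple unification function over nested dictionaries. This is a minimal sketch under several assumptions: the function `unify`, the exact feature layout, and the `GEND` feature are illustrative, types are compared by plain equality rather than through a type hierarchy, and token identity is not modeled.

```python
def unify(a, b):
    """Unify two feature structures; return the merged result, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None          # conflicting values somewhere below
                result[feature] = merged
            else:
                result[feature] = value  # b adds new information
        return result
    # Atomic values (here: type names as strings) must match exactly.
    return a if a == b else None

# What "walks" requires of its subject: a third person singular noun.
walks_subj = {"HEAD": "noun",
              "CONTENT": {"INDEX": {"PER": "3rd", "NUM": "sg"}}}

# Simplified SYNSEM values for two pronouns.
she_synsem = {"HEAD": "noun",
              "CONTENT": {"INDEX": {"PER": "3rd", "NUM": "sg", "GEND": "fem"}}}
they_synsem = {"HEAD": "noun",
               "CONTENT": {"INDEX": {"PER": "3rd", "NUM": "pl"}}}

print(unify(walks_subj, she_synsem) is not None)   # compatible
print(unify(walks_subj, they_synsem) is not None)  # NUM clashes: sg vs. pl
```

Unification succeeds for “she” because it only adds information to the requirement, and fails for the plural pronoun because the NUM values conflict.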

Signs of type phrase unify with one or more daughters and propagate information upward. The following AVM is for a head-subj-phrase that requires two daughters: the head daughter (a verb) and a non-head daughter that fulfills the verb’s SUBJ constraints.

The end result is a sign with a verb head, empty subcategorization features, and a phonological value that orders the two daughters.
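
The combination of a head daughter with its subject can be sketched as a function that checks the SUBJ constraint and then builds the mother sign. The code below is a simplified illustration, not the formalism itself: `subsumes` and `head_subj_phrase` are hypothetical helpers, the constraint check is a one-way subsumption test rather than full unification, and the feature layout (`VAL`, `SUBJ`, `COMPS`, `INDEX`) is an assumed abbreviation.

```python
def subsumes(general, specific):
    """True if every constraint in `general` is also met by `specific`."""
    if isinstance(general, dict):
        return (isinstance(specific, dict)
                and all(f in specific and subsumes(v, specific[f])
                        for f, v in general.items()))
    return general == specific

def head_subj_phrase(head_dtr, subj_dtr):
    """Combine a head daughter with a subject; return None if SUBJ is not satisfied."""
    subj_list = head_dtr["SYNSEM"]["VAL"]["SUBJ"]
    if len(subj_list) != 1 or not subsumes(subj_list[0], subj_dtr["SYNSEM"]):
        return None
    return {
        "TYPE": "phrase",
        # The phonology orders the subject before the head.
        "PHON": subj_dtr["PHON"] + head_dtr["PHON"],
        "SYNSEM": {
            # HEAD information is propagated upward from the head daughter.
            "HEAD": head_dtr["SYNSEM"]["HEAD"],
            # The subject requirement has been discharged.
            "VAL": {"SUBJ": [], "COMPS": head_dtr["SYNSEM"]["VAL"]["COMPS"]},
        },
    }

walks = {
    "TYPE": "word",
    "PHON": ["walks"],
    "SYNSEM": {
        "HEAD": "verb",
        "VAL": {
            "SUBJ": [{"HEAD": "noun", "INDEX": {"PER": "3rd", "NUM": "sg"}}],
            "COMPS": [],
        },
    },
}
she = {
    "TYPE": "word",
    "PHON": ["she"],
    "SYNSEM": {
        "HEAD": "noun",
        "INDEX": {"PER": "3rd", "NUM": "sg"},
        "VAL": {"SUBJ": [], "COMPS": []},
    },
}

sentence = head_subj_phrase(walks, she)
print(sentence["PHON"])
print(sentence["SYNSEM"])
```

Calling the function with the daughters swapped returns None in this sketch, since “she” has no subject requirement for “walks” to satisfy.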

Although the actual grammar of HPSG is composed entirely of feature structures, linguists often use trees to represent the unification of signs where the equivalent AVM would be unwieldy.

Implementations
Various parsers based on the HPSG formalism have been written, and optimizations are currently being investigated. An example of a system analyzing German sentences is provided by the Freie Universität Berlin. In addition, the Grammar Group of the Freie Universität Berlin provides open-source grammars that were implemented in the TRALE system. Currently there are grammars for German, Mandarin Chinese, Maltese, and Persian that share a common core and are publicly available. For Dutch, the wide-coverage dependency parser Alpino has been developed at the University of Groningen.

Large HPSG grammars of various languages are being developed in the DELPH-IN collaboration network. Wide-coverage grammars of German, English, and Japanese are available under an open-source license.