Computational structure of human language

words are rewritten as segmental strings at the interface between syntax and phonology (see chapter 3). The derivations of irregular and regular forms are identical from this perspective: both are simply the arbitrary rewriting of abstract morphological constituents into segmental strings. The first restriction on elementary rules, then, is to limit the class of arbitrary rewritings to the interface between phonology and morphology, and to ban the arbitrary rewriting of segmental strings from the phonology proper. Rules that delete, change, exchange, or insert segments, as well as rules that manipulate boundaries, are crucial to phonological theorizing, and therefore cannot be crudely constrained.9 More subtle and indirect restrictions are needed for these rules.10

One indirect restriction is to limit the possible interactions among rules. Because segmental grammars do not have a finite-state control, all rule interactions must arise via the derivation form (i.e., the sequence of segmental strings that constitutes the computation string for the segmental derivation). The computationally significant interactions are those that use the derivation form to store intermediate results of computations. The segmental model allows one rule to make a change in the derivation form, a subsequent rule to make a change to this change, and so on. A segment that is inserted can subsequently be deleted; a segment that is exchanged with another segment can subsequently be exchanged with yet another segment, or deleted.

9. One restriction proposed in the literature is McCarthy's (1981:405) "morpheme rule constraint" (MRC), which requires all morphological rules to be of the form A → B / X, where A is a unit or ∅, and B and X are (possibly null) strings of units. (X is the immediate context of A, to the right or left.)
The MRC does not constrain the computational complexity of segmental phonology, because individual rules can still insert and delete segments, and groups of rules can be coordinated to perform arbitrary rewriting.

10. That Chomsky and Halle were well aware of these problems is beyond doubt: "A possible direction in which one might look for such an extension of the theory is suggested by certain other facts that are not handled with complete adequacy in the present theory. Consider first the manner in which the process of metathesis was treated in Chapter Eight, Section 5. As will be recalled, we were forced there to take advantage of powerful transformational machinery of the sort that is used in the syntax. This increase in the power of the formal devices of phonology did not seem fully justified since it was made only to handle a marginal type of phenomenon. An alternative way to achieve the same results is to introduce a special device which would be interpreted by the conventions on rule application as having the effect of permuting the sequential order of a pair of segments." (p. 427)

We have every reason to believe that such interactions are not natural.11

The underlying form of a word must encode all the information needed to pronounce that word, as well as to recognize it. This information must be readily accessible, in order to ease the task of speaking, as well as the acquisition of the underlying forms of new words. The underlying form of a given word is a representation that omits all the directly predictable information in the surface form. The methodological directive "omit predictable information" means that a feature or segment of a surface form must be omitted if it is directly predictable from the properties of the phonology as a whole (such as the structure of articulations or the segmental inventory), or from the properties of that particular surface form, such as its morpheme class, adjacent segments, or suprasegmental patterns.
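The coordination worry above can be made concrete with a toy of my own construction (segments and forms are invented): two rules, each of the MRC shape A → B / X, that together permute a pair of segments, the metathesis for which Chomsky and Halle reached for transformational machinery. The second rule's change depends on the first rule's change, stored in the derivation form.

```python
def rewrite(form, a, b, left="", right=""):
    """One elementary rule A -> B / left _ right (A and B possibly empty)."""
    return form.replace(left + a + right, left + b + right)

def derive(underlying, rules):
    """Apply rules in order, recording the derivation form: the sequence
    of segmental strings produced by the computation."""
    forms = [underlying]
    for a, b, left, right in rules:
        forms.append(rewrite(forms[-1], a, b, left, right))
    return forms

# Metathesis of ps -> sp from two MRC-shaped rules: the first inserts a
# copy of s, the second deletes the original, a change made to a change.
rules = [("", "s", "", "ps"),   # 0 -> s / _ ps
         ("s", "", "sp", "")]   # s -> 0 / sp _
print(derive("apse", rules))    # ['apse', 'aspse', 'aspe']
```

Neither rule alone exceeds the MRC schema, yet their composition performs exactly the permutation the constraint was meant to rule out.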
To a first approximation, "directly predictable" means "computable by one rule with unbounded context and no intermediate results." The immediate consequence is that underlying forms cannot contain more segments or features than their corresponding surface forms. It is also true that the derivation adds the predictable information to the underlying form in a nearly monotonic fashion. The next restriction, then, is to propose a strictly monotonic segmental model, in which no rule of the grammar may shorten the derivation form. Deletion phenomena would be modelled using a diacritic that blocks the insertion of the "deleted" segment. The details of such a (nearly or strictly) monotonic model must of course be worked out. But the idea is promising, and if it is plausible, as it seems to be, then the simulations in footnote 8 would be excluded. This is one formal way to define the notion of "predictable information," which, based as it is on the fundamental notion of computationally accessible information, seems more coherent and fundamental than the notion of a "linguistically significant generalization," which has proven elusive.

11. In point of fact, insertions and deletions do not interact in the systems proposed by phonologists. Units are inserted only when they appear in the surface form and are totally predictable; such units are never deleted. Since inserted units are never deleted, and since the size of an underlying form is proportional to the size of its surface form, the derivation can perform only a limited number of deletions, bounded by the size of the underlying form. In general, deletions typically occur only at boundaries, in order to "fix up" the boundary between two morphemes. Because underlying forms cannot consist solely of boundaries, we would expect the size of an underlying form to be proportional to the size of its surface realization.
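The blocking-diacritic idea can be sketched as follows; this is a minimal toy under my own notational assumptions (the asterisk diacritic, the rule, and the segments are all invented for illustration).

```python
# Toy strictly monotonic derivation: the single rule 0 -> e / t _ k only
# inserts, so no step shortens the derivation form. A '*' diacritic on an
# underlying 't' marks a site where the insertion is blocked, standing in
# for what a deletion rule would otherwise have to undo.

def surface(underlying):
    out = []
    i = 0
    while i < len(underlying):
        seg = underlying[i]
        blocked = i + 1 < len(underlying) and underlying[i + 1] == "*"
        if blocked:
            i += 1  # the diacritic is part of the segment, never pronounced
        out.append(seg)
        nxt = underlying[i + 1] if i + 1 < len(underlying) else ""
        if seg == "t" and nxt == "k" and not blocked:
            out.append("e")  # epenthesis: 0 -> e / t _ k
        i += 1
    return "".join(out)

print(surface("patka"))   # 'pateka': epenthesis applies
print(surface("pat*ka"))  # 'patka': the diacritic blocks the insertion
```

No rule in this sketch ever removes material from the derivation form; the "deleted" segment simply never appears, which is the monotonic recasting of deletion described above.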
2.8.3 The SPE evaluation metric

The SPE evaluation metric is a proposal to define the notions of a natural rule and of a linguistically significant generalization. At first glance, this proposal seems vacuous. In order to minimize the number of symbols in the grammar, all observed surface forms should simply be stored in a dictionary of underlying forms. Then the number of symbols in the grammar is zero, and all the linguistically significant generalizations in the corpus have been discovered, namely, none. Clearly, this is not what Chomsky and Halle intended. Perhaps the size of the dictionary must be included in the metric as well. Now the most natural phonology is the smallest grammar-dictionary whose output is consistent with the observed corpus. The solution to this problem is also trivial: the optimal grammar-dictionary simply generates Σ*. So the simplest coherent revision of the SPE metric states that the most natural phonological system is the smallest grammar-dictionary that generates exactly the finite set of observed forms. Ignoring questions of feasibility (that is, how to find such a system), we run into serious empirical problems, because the observed corpus is always finite. The smallest grammar will always take advantage of this finiteness, by discovering patterns not yet falsified by the set of observed surface forms. The underlying forms in such an optimal grammar-dictionary system will in fact look nothing like the true underlying forms, i.e., those postulated by phonologists on the basis of evidence that is not available to the language acquisition device (LAD). And even if the set of underlying forms is fixed, the optimal grammar in such a system will still not be natural, failing standard empirical tests, such as those posed by loan words and language change.12

This observation is confirmed by the complexity proofs. An important corollary to lemma 2.3.1 is that segmental grammars form a universal basis for computation.
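This universality claim can be illustrated with a sketch of my own; the formalism below is a Post tag system, the rewriting device the text says segmental rules can directly simulate, and the particular production table is a standard textbook example rather than a construction from the text.

```python
# Toy Post 2-tag system runner. The control regime (delete a fixed number
# of symbols from the front, append the production for the first of them)
# is itself just repeated string rewriting.

def run_tag_system(word, productions, deletion=2, max_steps=1000):
    """Run until the word is too short or begins with a symbol that
    has no production; return the final word."""
    for _ in range(max_steps):
        if len(word) < deletion or word[0] not in productions:
            return word
        word = word[deletion:] + productions[word[0]]
    return word

# A classic example: these productions compute the Collatz iteration on a
# unary-coded number a^n; starting from a^3 the system halts at a single 'a'.
productions = {"a": "bc", "b": "a", "c": "aaa"}
print(run_tag_system("aaa", productions))  # 'a'
```

Three productions over a three-symbol alphabet already encode nontrivial arithmetic, which is the force of the observation that universal computation needs far fewer symbols than any natural-language grammar.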
For example, it is possible to directly simulate an arbitrary Post tag system using a very simple set of phonological rules. Or we can simulate the four-symbol, seven-state "smallest universal Turing machine" of Minsky (1967) in the segmental model; the resulting grammar contains no more than three features, eight specifications, and 36 trivial rules. These segmental grammars of universal computation contain significantly fewer symbols than a segmental grammar for any natural language. And this is not even the best that can be done. The smallest combined grammar-dictionary for the set of all observed words will be even smaller, because it can take advantage of all computable generalizations among the finite set of observed surface forms, not only the linguistically significant ones. In fact, the attached dictionary would represent the Kolmogorov complexity of the observed surface forms with respect to the optimal segmental grammar, i.e., the true information content of the observed surface forms with respect to an arbitrarily powerful encoder. Therefore, this corollary presents severe conceptual and empirical problems for the segmental theory. In short, even if we ignore questions of feasibility, the smallest segmental grammar-dictionary capable of enumerating the set of observed surface forms cannot be natural, because it must discover too many unnatural generalizations. How then can we make sense of the SPE evaluation metric?

12. In my brief experience as a phonologist, the most natural grammars did not have the smallest number of symbols, even when the proper morphemic decomposition of underlying forms was known in advance. With enough time and mental discipline, it was always possible to construct a smaller grammar than the "correct" one, by taking advantage of "unnatural" patterns in the observed surface forms. Increasing the number of examples does not help, simply because there will never be enough examples to exclude all the computable but unnatural patterns.
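Before answering, the counting at issue can be made concrete with a toy of my own construction; the rule notation, the corpus, and the one-symbol-per-character costs are all invented, a crude stand-in for counting feature specifications.

```python
# A crude toy of the naive symbol-counting reading of the SPE metric:
# the cost of a phonology is the symbols in its rules plus the symbols
# in its dictionary of underlying forms.

def symbol_count(rules, dictionary):
    """Total size of a grammar-dictionary pair, counted in symbols."""
    return sum(len(r) for r in rules) + sum(len(w) for w in dictionary)

# Twelve observed surface forms, each containing the predictable 'e' of a
# hypothetical epenthesis rule 0 -> e / t _ k.
corpus = [c + v + "teka" for c in "pbtk" for v in "aiu"]

# Option A: no grammar at all; memorize every surface form outright.
memorize = symbol_count([], corpus)                   # 12 * 6 = 72 symbols

# Option B: pay 8 symbols for the rule "0->e/t_k" and omit the predictable
# 'e' from every underlying form.
underlying = [w.replace("teka", "tka") for w in corpus]
generalize = symbol_count(["0->e/t_k"], underlying)   # 8 + 12 * 5 = 68

print(memorize, generalize)  # 72 68
```

On this tiny corpus the rule wins by only four symbols, and on a smaller corpus memorization would win outright; nothing in the count itself distinguishes this natural rule from an unnatural pattern that happens to compress the corpus, which is the vacuity worry developed above.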
The evaluation metric makes certain sets of disjunctively ordered elementary rules as natural as a single elementary rule. The fundamental difference between a complex rule and an elementary rule is that a complex rule is capable of performing nonlocal Post-style rewriting, whereas elementary rules are limited to local Thue-style rewriting. Therefore, the SPE evaluation metric formalizes the obs

[1] John J. McCarthy. A prosodic theory of nonconcatenative morphology. 1981.
[2] Samuel Jay Keyser et al. CV Phonology: A Generative Theory of the Syllable. 1988.
[3] A. Turing. On Computable Numbers, with an Application to the Entscheidungsproblem. 1937.
[4] Noam Chomsky and Morris Halle. The Sound Pattern of English. 1968.
[5] Howard Lasnik et al. On the nature of proper government. 1990.
[6] Richard S. Kayne. Connectedness and binary branching. 1984.
[7] Joachim Lambek. On the Calculus of Syntactic Types. 1961.
[8] Geoffrey K. Pullum et al. Generalized Phrase Structure Grammar. 1985.
[9] G. M. Horn. On 'On binding'. 1981.
[10] Dominique Sportiche. A note on long extraction in Vata and the ECP. 1986.
[11] Mark C. Baker. Incorporation: A Theory of Grammatical Function Changing. 1988.
[12] Esther Torrego Salcedo. On inversion in Spanish and some of its effects. 1984.
[13] Andrew Chi-Chih Yao. Computational information theory. 1988.
[14] Noam Chomsky. Some notes on economy of derivation and representation. 2013.
[15] Noam Chomsky. Aspects of the Theory of Syntax. 1965.
[16] Tim Stowell. Origins of phrase structure. 1981.
[17] Isabelle Haik. Bound VPs that need to be. 1987.
[18] Eric Sven Ristad. Computational Complexity of Current GPSG Theory. ACL, 1986.
[19] Martin Plátek et al. A Scale of Context Sensitive Languages: Applications to Natural Language. Information and Control, 1978.
[20] Bradley L. Pritchett. Garden Path Phenomena and the Grammatical Basis of Language Processing. 1988.
[21] T. Reinhart. Anaphora and semantic interpretation. 1983.
[22] M. Halle. Problem Book in Phonology. 1983.
[23] J. Heath. The languages of kinship in Aboriginal Australia. 1982.
[24] Ivan A. Sag. Deletion and Logical Form. 1976.
[25] Howard Lasnik et al. On the Necessity of Binding Conditions. 1989.
[26] Noam Chomsky. Lectures on Government and Binding. 1981.
[27] P. Stanley Peters et al. On the generative power of transformational grammars. Information Sciences, 1973.
[28] R. Duncan Luce et al. Readings in mathematical psychology. 1963.
[29] David Pesetsky. Paths and categories. 1982.
[30] S. Kuroda. Whether We Agree or Not: A Comparative Syntax of English and Japanese. 1988.
[31] T. Wasow. Anaphoric relations in English. 1972.
[32] A. Luria. The Mind of a Mnemonist: A Little Book about a Vast Memory. 1968.
[33] Haskell B. Curry. Some logical aspects of grammatical structure. 1961.
[34] W. B. Cameron. The Mind of a Mnemonist: A Little Book about a Vast Memory. 1970.
[35] Jeffrey D. Ullman et al. Introduction to Automata Theory, Languages and Computation. 1979.
[36] R. May. Logical Form: Its Structure and Derivation. 1985.
[37] J. Higginbotham. Reference and Control. 1992.
[38] E. Keenan. Names, quantifiers, and the sloppy identity problem. 1971.
[39] D. Finer. The formal grammar of switch-reference. 1984.
[40] Eric Sven Ristad. Complexity of Human Language Comprehension. 1988.
[41] B. Partee. Montague Grammar and Transformational Grammar. 1975.
[42] Noam Chomsky. The Logical Structure of Linguistic Theory. 1975.
[43] Jean-Yves Pollock. Verb movement, universal grammar and the structure of IP. 1989.
[44] Dominique Sportiche. A theory of floating quantifiers and its corollaries for constituent structure. 1988.
[45] Uriel Weinreich. On semantics. 1980.
[46] C. Douglas Johnson. Formal Aspects of Phonological Description. 1972.
[47] Robert C. Berwick. Computational Consequences of Agreement and Ambiguity in Natural Language. 1988.
[48] Elizabeth Caroline Sagey. The representation of features and relations in non-linear phonology. 1986.
[49] Noam Chomsky. Some Concepts and Consequences of the Theory of Government and Binding. 1982.
[50] Yoshihisa Kitagawa. Copying identity. 1991.
[51] J. Rissanen. Modeling by Shortest Data Description. Automatica, 1978.
[52] Alec Marantz. Projection vs. percolation in the syntax of synthetic compounds. 1989.
[53] Naoki Fukui. A theory of category projection and its applications. 1990.
[54] Beth Levin. -er Nominals: Implications for the Theory of Argument Structure. 1992.