Harnad, Stevan (2008) 'Why and How the Problem of the Evolution of Universal Grammar (UG) is Hard' Behavioral and Brain Sciences (forthcoming); Commentary on Christiansen, Morten H. and Chater, Nick (2008) "Language as Shaped by the Brain" Behavioral and Brain Sciences (forthcoming)  http://www.bbsonline.org/Preprints/Christiansen-12292006/

[Note: This is the long version. The published version is shorter.]

Why and How the Problem of the Evolution of Universal Grammar (UG) is Hard


Stevan Harnad

Chaire de recherche du Canada

Institut des sciences cognitives

Université du Québec à Montréal

Montreal, Quebec,  Canada  H3C 3P8



Department of Electronics and Computer Science

University of Southampton

Highfield, Southampton




Abstract: Universal Grammar (UG) is a complicated set of grammatical rules that underlies our grammatical capacity. We all follow the rules of UG, but we were never taught them, and we could not have learned them from trial and error experience either (not enough data, or time). So UG must be inborn. But for similar reasons, it seems implausible that UG was 'learned' by trial and error evolution either: What was the variation and competition? And what were UG's adaptive advantages? So this leaves the hard problem of explaining where our brain's UG capacity came from. Christiansen & Chater (C&C) suggest an answer: Language is an organism, like us, and our brains were not selected for UG capacity; rather, languages were selected for learnability with minimal trial and error experience by our brains. This explanation is circular: Where did our brains' selective capacity to learn all and only UG-compliant languages come from? Chomsky suggests it might be a combination of optimality and logical necessity.



It might be a good idea to remind ourselves exactly why and how the problem of the evolutionary origins of Universal Grammar (UG) is hard (Harnad 1976) -- and hence perhaps not quite as readily solvable as Christiansen & Chater (C&C) suggest it might be.


Universal Grammar (UG). First, what is UG? It is a surprisingly complicated and abstract set of grammatical rules: not the grammar rules you learned in school (or figured out from hearing and reading), but rules that no one even knew existed until Noam Chomsky discovered them. And UG remains a set of rules that most of us (including me!) still don't know to this day -- don't know explicitly, that is, in the sense that we were never taught them, we are not aware of them, we cannot put them into words, and we would not recognize them (or even understand them, without considerable technical training) if they were explicitly told to us in words (and symbols) by a professional grammarian.


Implicit Knowledge. Yet we all in a sense do 'know' all those rules of UG "implicitly," because they are the rules that make us capable of speaking grammatically at all -- able to produce all and only the sentences that are grammatically well-formed, according to UG, and able to recognize and reject all the sentences that are grammatically ill-formed according to UG. It's rather as if we all knew implicitly how to play chess -- we could make all and only the legal moves -- yet we had no explicit idea what rules we were following.


Learning. The reason it makes little sense to imagine being able to play chess without being able to say what rules we are following, and without somehow having learned them from experience -- either by being told them explicitly, or by figuring them out through a combination of watching and imitating others play and ourselves playing by trial and error, with our wrong moves corrected by those who know -- is that the rules of chess are simple, we all learned them either one way or the other (if we know how to play chess at all), and we can all perfectly well verbalize them, or at least recognize them if someone else verbalizes them.


Not so for UG. UG's rules are abstract, complex and technical. Since Chomsky first discovered their existence, linguists have gradually been figuring them out through decades of careful analysis, through hypothesis, trial, and error, based on consulting the grammatical intuitions we all share about what can and cannot be said, and then trying to construct for those a set of rules that will allow all and only the sentences we all immediately recognize as well-formed and disallow all those we recognize as ill-formed. That set of rules -- not yet complete even today, but already able to explain a good-sized chunk of our grammatical capacity -- is UG, and it turned out to have some surprising properties:


Universality. First, UG turned out to be universal: All languages have turned out to obey the very same set of rules. The allowable grammatical differences between languages are all there in UG too, as available "parameter settings" on the set of rules. When children learn a particular language, they learn how to adjust the parameters on the rules of UG to configure them for that particular language. But the most surprising thing of all was that children do not learn the rules of UG itself.


Unlearnability. Children cannot learn the rules of UG because -- unlike with chess -- the rules of UG are too complicated and abstract to learn by observation and trial and error on the basis of the information available to the language-learning child. (It took UG linguists many years to first 'learn' them from data by trial and error -- many more years and much more data than those available to the language-learning child.) And, as noted, those rules are not taught or learned by explicit instruction either -- because, before Chomsky and the field of UG linguistics he created, no one even knew the rules, let alone taught them, even though our species had been speaking language for a hundred thousand years.


'Poverty of the Stimulus'. We have now reached the point where I can state exactly why the problem of the evolution of UG is so hard, and why C&C's solution is too weak to solve it: The reason the child does not and cannot learn the rules of UG by observation, trial and error, and error-correction (let's forget about instruction, because, as noted, before Chomsky no one even knew what the rules of UG were, let alone tried to teach them to a child) is that the data on the basis of which the rules of UG would have to be learned by the child do not contain anywhere near enough of the information that the child (or any learning system at all) would need to have in order to be able to infer the rules from them. (This is called the 'poverty of the stimulus' or the computational 'underdetermination' of the rules of UG by the database from which they would have to be learned, if they were learned at all.)


Error-Correction. To put it very simply: In order to be learned at all, the rules of UG would have to be learnable through trial and error, with error-correction -- exactly as chess-rules have to be, in order to be learnable without explicit instruction: I try to move my bishop in a certain way, and you tell me, no, that's not a legal move, this is, and so on. Well, in a nutshell, children cannot learn the rules of UG that way because they basically never make (or hear) any UG errors ('wrong moves'); hence children never get or hear any UG error-corrections.
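The role of error-correction can be illustrated with a deliberately tiny toy, offered only as a sketch. The three candidate rule systems below are hypothetical rules over strings of 'a' and 'b'; none of them is a rule of UG, and the names equal_blocks, a_then_b, anything and consistent are invented for the illustration. The point is only that well-formed examples alone leave several rule systems in play, whereas a single corrected 'wrong move' can eliminate the rivals.

```python
import re

# Hypothetical candidate rule systems over a toy alphabet {a, b}.
# None of these is a rule of UG; they only illustrate underdetermination.

def equal_blocks(s):
    """Accept some a's followed by the same number of b's."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def a_then_b(s):
    """Accept any number of a's followed by any number of b's."""
    return re.fullmatch(r"a*b*", s) is not None

def anything(s):
    """Accept every string of a's and b's."""
    return re.fullmatch(r"[ab]*", s) is not None

CANDIDATES = {
    "equal_blocks": equal_blocks,
    "a_then_b": a_then_b,
    "anything": anything,
}

def consistent(candidates, positives, negatives=()):
    """Names of candidates that accept every positive example
    and reject every corrected (negative) example."""
    return sorted(
        name
        for name, accepts in candidates.items()
        if all(accepts(s) for s in positives)
        and not any(accepts(s) for s in negatives)
    )

# What the learner hears: only well-formed strings.
positives = ["ab", "aabb", "aaabbb"]

# Without any corrected error, every candidate remains compatible.
survivors_without_correction = consistent(CANDIDATES, positives)

# One corrected 'wrong move' ("aab" is ill-formed) rules out the rivals.
survivors_with_correction = consistent(CANDIDATES, positives, negatives=["aab"])

print(survivors_without_correction)
print(survivors_with_correction)
```

In this toy the learner is rescued by one correction. The argument in the text is that, for the rules of UG, the child never makes or hears the relevant errors, so no such correction ever arrives.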


It is not that children speak flawlessly from birth. We all know they cannot do that. But the observation, imitation, and error correction that the child does experience during the relatively brief period of transition from being unable to speak to being able to speak does not involve any errors (or error-corrections) in the rules of UG, either from the child or from the speakers that the child hears. There are grammatical errors and corrections aplenty, to be sure, but they are corrections pertaining to the old-fashioned grammatical rules that we all know or can know explicitly, not the complex, abstract, implicit rules of UG. Those UG rules are never violated by the child, nor by anyone the child ever hears (unless its parent is a Chomskian linguist, working aloud at home!).


At first the child cannot speak at all. Then it begins producing agrammatical or grammatically simple utterances alongside its rote imitations. And then it is speaking perfectly UG-compliantly. Insofar as the rules of UG are concerned, the child has learned only the parameter settings. The rules themselves were never broken, never corrected, hence never "learned": Therefore they must already have been inborn.


Evolutionary Trial and Error. Having made it explicit exactly why the UG problem is hard, I now turn to why and how I think C&C fail to solve it: The problem of the origin of UG is hard for the grammar-learning theorist because, owing to the poverty of the stimulus, UG is unlearnable by the child. The provisional solution there is to conclude that the child must therefore be born with the rules of UG already encoded in its brain. But that just raises the further problem of the evolutionary origin of those inborn, genetically coded rules. That is in fact an even bigger problem, because in a sense evolution faces the same learning problem the child does. Evolution has more time available than the child, but it has an even more impoverished database: It is not at all clear what would serve as error-correction, and what would count as right and wrong, in order to shape UG in the usual Darwinian way: through trial and error genetic variation, and adaptive selection on the basis of advantages in survival and reproduction.


The Adaptive Advantage of UG? In the case of the evolution of other biological structures, such as fins, wings or eyes, or the evolution of biological functions such as the capacity to see, learn, or reason, there is no problem in principle for the usual kind of evolutionary trial-and-error explanation, even in the cases where the adaptive explanation has not yet been fully worked out in practice. But with UG there is a deep problem in principle. The problem arises not merely because of UG's complexity (for many organs are complex, and evolution, unlike the language-learning child, has a lot of time available to 'shape' them through trial and error variation and selection). The hard problem arises because UG has no apparent adaptive advantages. For although a professional grammarian's lifetime is long enough to work out most of UG's rules explicitly by trial and error induction, it turns out that (with the possible exception of a few small portions of UG) no logical or practical advantage has yet been discerned that favors what UG allows over what it disallows, or over an altogether different set of grammatical rules (perhaps even a much simpler and learnable set). The absence of a biological advantage for UG is an even greater handicap than the poverty of the stimulus. It means that even with all of evolutionary time at its disposal, there is no ordinary evolutionary explanation for how or why UG would have been selected (if its basis is the usual genetic variation, selectively propagated through the survival/reproduction advantages it confers on its bearers).


The Circularity of C&C's Co-Evolutionary Hypothesis. C&C rightly express skepticism about alternative 'piggy-back' theories of the evolutionary origin of UG, because there is simply no credible precursor structure or function, one that had a separate prior, plausible adaptive advantage of its own, for some other biological purpose, that could then have been co-opted to do the duties of UG as well: Nothing homologous to the complex and abstract formal rules of UG exists in brain or behavior. But C&C's alternative proposal is no more convincing: They say that language, too, is an 'organism,' like people and animals, that it too varies across generations, historically, and that the shape that language took was selectively determined by the shape the brain already had, in that only the languages that were learnable by our brains successfully 'survived and reproduced.'


The trouble with this hypothesis is that it is circular: We were looking for the evolutionary origin of the complex and abstract rules of UG. C&C say (based on their computer simulations of far simpler rule systems, not bound by the poverty of the stimulus): Don't ask how the UG rules evolved in the brain. The rules are in language, which is another organism, not the brain. The brain simply helped shape the language, in that the variant languages that were not learnable by the brain simply did not 'survive.'


This hypothesis begs the question of why and how the brain acquired an evolved capacity to learn all and only UG-compliant languages in the first place, despite the poverty of the stimulus -- which was the hard problem we started out with! It would be like saying that the reason we are born already knowing the rules of chess without ever having to learn them by trial and error is that, in our evolutionary past, there was variation in the games (likewise 'organisms') that we organisms tried to play, and only those games that we could play without having to learn them by trial and error survived! (That still would not even begin to explain what it is about our brains that makes them able to play chess without trial and error!)
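The circularity can be made concrete with a minimal sketch. It is not C&C's simulation; the names Language, surviving_languages, ug_biased_brain and unbiased_brain are hypothetical, and the ug_compliant label is simply stipulated for each toy language. The sketch shows only that a 'survival of the learnable' filter returns whatever the learner's built-in bias already picks out.

```python
from typing import Callable, Iterable, List, NamedTuple


class Language(NamedTuple):
    name: str
    ug_compliant: bool  # hypothetical label, stipulated for this toy


def surviving_languages(
    variants: Iterable[Language],
    brain_can_learn: Callable[[Language], bool],
    generations: int = 5,
) -> List[Language]:
    """Keep, generation after generation, only the variants the brain can learn."""
    pool = list(variants)
    for _ in range(generations):
        pool = [lang for lang in pool if brain_can_learn(lang)]
    return pool


# Hypothetical variant languages.
variants = [
    Language("L1", True),
    Language("L2", False),
    Language("L3", True),
    Language("L4", False),
]


def ug_biased_brain(lang: Language) -> bool:
    """A learner assumed to be able to learn all and only UG-compliant languages."""
    return lang.ug_compliant


def unbiased_brain(lang: Language) -> bool:
    """A learner assumed to be able to learn any language whatsoever."""
    return True


ug_survivors = [lang.name for lang in surviving_languages(variants, ug_biased_brain)]
all_survivors = [lang.name for lang in surviving_languages(variants, unbiased_brain)]

print(ug_survivors)
print(all_survivors)
```

All the explanatory work is done by the body of brain_can_learn. The filter accounts for which languages survive only by presupposing the learner's bias, and the origin of that bias is exactly what was to be explained.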


The Adaptive Advantage of Language. This circularity is partly a result of a vagueness about what exactly is the target of language evolution theory. Pinker & Bloom (1990) had already begun the misleading practice of freely conflating evolutionarily unproblematic questions (such as the origins of phonology, learnable aspects of grammar, vocabulary, 'parity') with the one hard problem of the origins of UG, which specifically concerns the evolutionary origins of complex rules that are unlearnable because of the poverty of the stimulus. Language, after all, is not just grammar, let alone just UG. If, on the one hand, the adaptive value of language itself (Cangelosi & Harnad 2001; Harnad 2005, 2007) could have been achieved with a much simpler grammar than UG (perhaps even a learnable one), then the evolutionary origin and adaptive function of UG becomes all the harder to explain, with C&C's historical variation in the language 'organism' occurring far too late in the day to be of any help. If, on the other hand, the adaptive advantages of language were impossible without UG, then we are still left with the hard problem of explaining how and why not.


UG As A Necessity for Thought? Chomsky (2005) himself has suggested that UG may be a necessary (i.e., Platonic) property of being able to think at all: A fundamental computational capacity in the form of a single (implicit) formal operation called 'unbounded Merge,' carried by a single mutation a hundred thousand years ago, conferred on our species all the power and adaptive advantages of thought -- and Merge carried UG with it as a necessary constraint, much the way the power to add carries with it the necessary constraint that 2+2=4 rather than 5.


Chomsky has been right on so much else that this possibility definitely needs to be taken seriously. But to solve the hard problem, the Merge-mutation theory will need to explain exactly how UG is a matter of logical or functional necessity in order to be able to think at all.
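To indicate what kind of operation 'unbounded Merge' is usually taken to be, here is a minimal sketch. It assumes the common textbook characterization of Merge as forming the unordered set of two syntactic objects, which the text above does not itself spell out. The function names merge and depth and the example words are hypothetical. The sketch shows only how one operation, applied to its own outputs, yields hierarchical structure of unbounded depth; it does not show that the constraints of UG follow from it, which is precisely what the Merge-mutation theory would still have to explain.

```python
def merge(x, y):
    """Combine two syntactic objects into one unordered set {x, y}."""
    return frozenset([x, y])


def depth(obj):
    """Nesting depth of a merged object; a bare word has depth 0."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(part) for part in obj)


# Hypothetical lexical items, chosen only for illustration.
noun_phrase = merge("the", "child")
verb_phrase = merge("learns", merge("a", "language"))
clause = merge(noun_phrase, verb_phrase)

# Merge applies to its own outputs, so nothing bounds the nesting depth.
deeper = merge("that", clause)

print(depth(noun_phrase), depth(verb_phrase), depth(clause), depth(deeper))
```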




Cangelosi, A. & Harnad, S. (2001) The Adaptive Advantage of Symbolic Theft Over Sensorimotor Toil: Grounding Language in Perceptual Categories. Evolution of Communication 4(1) 117-142. http://cogprints.org/2036/


Chomsky, N. (2005) Some Simple Evo-Devo Theses: How True Might They Be For Language? Alice V. and David H. Morris Symposium on Language and Communication; The Evolution of Language. Stony Brook University, New York, USA (October 14 2005)



Harnad, S. (1976) Induction, evolution and accountability. In: Harnad, S., Steklis, H. D. & Lancaster, J. B., Eds. Origins and Evolution of Language and Speech, 58-60. Annals of the New York Academy of Sciences. http://cogprints.org/0863


Harnad, S. (2005) To Cognize is to Categorize: Cognition is Categorization, in Lefebvre, C. and Cohen, H., Eds. Handbook of Categorization. Elsevier.  http://eprints.ecs.soton.ac.uk/11725/


Harnad, S. (2007) From Knowing How To Knowing That: Acquiring Categories By Word of Mouth. Presented at Kazimierz Naturalized Epistemology Workshop (KNEW), Kazimierz, Poland, 2 September 2007. http://eprints.ecs.soton.ac.uk/14517/


Pinker, S. & Bloom, P. (1990) Natural language and natural selection. Behavioral and Brain Sciences 13:707–27. http://www.bbsonline.org/Preprints/OldArchive/bbs.pinker.html