Abstract

What distinguishes the cognition of biologically modern humans from that of more archaic populations such as Neandertals? The norm in paleoanthropology has been to emphasize the role of language and symbolism. But the modern mind is more than just an archaic mind enhanced by symbol use. It also possesses an important problem solving and planning component. In cognitive neuroscience these advanced planning abilities have been extensively investigated through a formal model known as working memory. The working memory model is now well-enough established to provide a powerful lens through which paleoanthropologists can view the fossil and archaeological records. The challenge is methodological. The following essay reviews the controversial hypothesis that a recent enhancement of working memory capacity was the final piece in the evolution of modern cognition.

1. Introduction

Ever since the publication of The Human Revolution twenty years ago [1], many of the most exciting and contentious debates in paleoanthropology have revolved around the evolution of modern humans. The research and discussion have focused on two intertwined goals: (1) understanding the evolutionary transition from archaic to modern anatomy and behavior, and (2) tracing the specific evolutionary scenario by which the transition took place. In pursuing the first goal, paleoanthropologists strive to identify the derived characteristics that distinguish modern anatomy and behavior from those of more archaic humans, and propose selective hypotheses to account for them. This goal is in keeping with the overall desire of paleoanthropologists to understand the entire sweep of human evolution. The second goal is the more difficult, and certainly the more contentious. Many paleoanthropologists would like to trace the actual sequence of events that led to modernity: Who made the transition? Where did it happen? When did it happen? And, perhaps most contentious of all, what happened to archaic populations such as Neandertals? Archaeologists have been intimately involved in the pursuit of both goals. They have tried to identify derived features of modern behavior, as preserved in the archaeological record, and have proposed hypotheses to account for the emergence of modern behavior [2–5]. But they have also attempted to trace the specific evolutionary scenario through use of the same archaeological remains [6–9]. As with the fossil remains, the second goal has focused primarily on the fate of a single population from southern or eastern Africa, and the consequences that ensued for archaic groups, especially Neandertals.

Our interest is primarily in goal #1. In particular, we are interested in documenting the emergence of modern cognition, relying as much as possible on macroevolutionary evidence supplied by archaeology. Until recently, archaeological treatments of the cognitive components of modernity have been dominated by variously developed assertions concerning language and symbolic culture. This focus on symbolism resulted from two quirks in the history of palaeolithic archaeology. The first is an accident of archaeological discovery. The first palaeolithic culture attributed to early modern humans was the European Upper Palaeolithic (ca. 40,000–14,000 years ago), and the most spectacular component of it was its art—cave paintings, figurines, and personal ornaments. The Neandertals who preceded the Upper Palaeolithic in Europe had little or none of this, and the contrast was stark. The presence of art implied the presence of meaning, and modern symbolic sensibilities, or so the reasoning went. Thus, in Europe modern anatomy (Cro-Magnon) and modern symbolism came to be linked, a scenario that was then applied with little criticism to other continents. The second quirk is linked to the history of archaeological method and theory. In the 1970s, palaeolithic archaeologists began to take social science seriously as a source of interpretive models. At that time social science had a strong structuralist orientation, and Chomsky’s hypotheses about language were paramount. As a result, many archaeologists came to consider language to be the sine qua non of humanness. Indeed, this opinion was so generally held that few ventured to question its appropriateness when the “modern question” came to dominate the attention of paleoanthropology in the 1990s. Several influential summaries attributed modern behavior to the emergence of modern language, on the assumption that enhanced communication ability would have had obvious evolutionary advantages [1, 5]. A few archaeologists took a more critical stance. Davidson and Noble, for example, carefully parsed language into its grammatical and symbolic components, and, based on the work of Wittgenstein and James Gibson, argued that true symbolism was a late emerging faculty [10, 11].

Symbolism and language are certainly components of modern thinking, and documenting their evolution is important. However, alone they are insufficient to account for all of the features of the modern mind. The modern mind is not, we contend, simply an archaic mind augmented by symbolism and language. There are arguably several other components, including problem solving and long-range planning abilities. A few archaeologists have appreciated this—most notably Mithen [12], Ambrose [13], and Davidson and Noble [14]. The latter, for example, argued that the colonization of Australia required planning abilities only possible via modern language—so we are not alone in this contention. Our approach differs from theirs in its theoretical and methodological bases. For the last decade, we have taken a specific cognitive model, that of executive functions and working memory, and used it to interpret archaeological remains. The result has been a different and controversial picture of the emergence of one of the components of modern thinking [15–18].

2. Executive Functions and Working Memory

As the label implies, executive functions encompass the brain’s ability to plan and strategize. The term and concept were first developed by neuropsychologists working with brain-damaged adults, initially World War II soldiers with head wounds. Neuropsychologists such as Luria [19] described patients with damage to the frontal lobes who retained full language faculties but who were unable to carry out complex, purposive, goal-directed actions. They were unable to evaluate the success or failure of their own actions, and unable to alter their future behavior. Neural activity centered in the frontal lobes was clearly very important for higher-level thinking. Executive functions of the frontal lobes were:

“…at the heart of all socially useful, personally enhancing, constructive, and creative abilities…. Impairment or loss of these functions compromises a person’s capacity to maintain an independent, constructively self-serving, and socially productive life no matter how well he can see, hear, walk and talk, and perform tests” [20], p. 281.

By the 1960s executive functions had become a well-established array of cognitive abilities in the field of neuropsychology, and neuropsychologists went on to devise a number of assessment tools for evaluating brain-damaged individuals (e.g., the Tower of London test). Adults with executive function (EF) deficits have a difficult time functioning on their own and often have poor social interaction skills (e.g., interacting with salespeople to purchase an item). The significant role of EFs in social life led Barkley [21] to suggest that the evolution of EFs occurred via selection for effective social cognition.

Coolidge and Wynn [15] initially proposed that EFs, as defined primarily by neuropsychologists, were a better candidate than language for the neural development that led to modern thinking. It soon became clear to them, however, that a well-developed model in cognitive psychology encompassed many of the same cognitive phenomena as EFs, but had the advantage of being more explicit in terms of specific abilities, and also being based on experimental research with normal, that is, not brain-damaged, individuals. This model is working memory.

3. The Working Memory Model

The concept of working memory (WM) was first developed by Baddeley and Hitch in 1974 [22] as an elaboration of the older concept of short-term memory. At its most basic, WM is the mind’s ability to hold in attention, and process, task-relevant information in the face of interference. As an example we ask the reader to perform the following task: as you read this paragraph, remember the final word in each of the sentences and after completing the paragraph, recite these terminal words in order, from memory. This is a test of working memory capacity. The more words you can remember, the greater your WM capacity. Note that there are two components to this task. First, you must remember a sequence of words. This is a classic short-term memory test, and in itself is not too taxing. But in this context you must do it while reading, which interferes with remembering the words. The WM model has been perhaps the single most researched and successful cognitive model of the last forty years. It has integrated and synthesized research results from several allied fields such as psychology, neurology, and neuropsychology. Even more important, psychometric measures of WM capacity correlate with a wide variety of critical cognitive abilities, including reading comprehension, vocabulary learning, language comprehension, language acquisition, second language learning, spelling, story telling, fluid intelligence, and general intelligence (e.g., [16]). The correlation with fluid intelligence is especially important because fluid intelligence is one’s ability to solve novel problems. It is less influenced by learning and culture than general intelligence (IQ) and tied directly to one’s problem solving ability. Interestingly, WM has even more recently been referred to as the new intelligence [23]. Thus, the WM model is a natural heuristic for enquiring into the evolution of modern thinking.
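
To make the reading-span task just described concrete, the following minimal sketch (in Python) shows how such a test might be scored. The function name, the example sentences, and the strict serial-recall scoring rule are our own illustrative choices; actual laboratory span tasks use standardized materials and several alternative scoring schemes.

def reading_span_score(sentences, recalled):
    # The terminal word of each sentence is the to-be-remembered item.
    targets = [s.rstrip('.!? ').split()[-1].lower() for s in sentences]
    score = 0
    for target, response in zip(targets, recalled):
        if response.lower() != target:
            break              # strict scoring: credit only the unbroken initial run
        score += 1
    return score

sentences = [
    "The hunters set their snares before dawn.",
    "Bone points were reworked many times.",
    "The herds crossed the valley in autumn.",
]
print(reading_span_score(sentences, ["dawn", "times", "autumn"]))   # prints 3

The more terminal words a reader can report, in order, while still comprehending the sentences, the larger the measured WM capacity.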

As currently understood, WM is not a single, simple neural system but a set of interlinked abilities. The current WM model, as set out by Baddeley [24], consists of an attentional, pan-modal processor (the “central executive”), two subsystems (the “phonological loop” and the “visuospatial sketchpad”), and a temporary memory store (the “episodic buffer”). The phonological loop is dedicated to auditory phenomena, and maintains and rehearses auditory information either vocally or subvocally. The visuospatial sketchpad is a distinct subsystem that processes visual information (shapes and locations). The two subsystems can operate simultaneously, so that, for example, one can perform a visuospatial task and a speech task with minimal interference. The episodic buffer holds information provided by the subsystems in active attention, where it can be processed by the resources of the central executive. The central executive of WM performs most of the processing, including attention, active inhibition (e.g., suppressing distracting stimuli or prepotent responses), decision making, planning, sequencing, temporal tagging, and the updating of information in the two subsystems. It also serves as the chief liaison to long-term memory and language comprehension. The central executive takes control when novel tasks are encountered, and one of its most important functions is to override pre-existing habits and inhibit prepotent responses.
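
To make the architecture just sketched concrete, the following toy illustration (in Python) renders the relationship between the two subsystems, the capacity-limited episodic buffer, and the central executive. All class names, the capacity value, and the interfaces are our own assumptions for exposition; this is not a cognitive simulation and not anyone's published implementation.

from collections import deque

class EpisodicBuffer:
    """Temporary, capacity-limited store; the oldest item is lost when it is full."""
    def __init__(self, capacity=4):          # illustrative capacity, not an empirical claim
        self.items = deque(maxlen=capacity)
    def hold(self, item):
        self.items.append(item)

class CentralExecutive:
    """Attentional controller: routes subsystem content into the buffer and processes it."""
    def __init__(self, buffer):
        self.buffer = buffer
    def attend(self, phonological=None, visuospatial=None):
        for item in (phonological, visuospatial):
            if item is not None:
                self.buffer.hold(item)
    def process(self, operation):
        # Only what the buffer currently holds can be compared, combined, or manipulated.
        return operation(list(self.buffer.items))

buffer = EpisodicBuffer(capacity=4)
executive = CentralExecutive(buffer)
executive.attend(phonological="terminal word: 'dawn'", visuospatial="shape: converging stone lines")
print(executive.process(len))                # prints 2: two items are currently in attention

The point of the sketch is simply that processing operates over whatever the buffer can hold; enlarging that capacity is, in effect, what the enhanced working memory hypothesis proposes.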

Neurologically, WM is primarily a network of the prefrontal cortex (PFC) but also relies on extensive linkages to parietal and temporal lobes and also connections to subcortical regions. The dorsolateral PFC has long been associated with executive functions, but there appears to be no single neural structure that can be isolated. The central executive, for example, appears to emerge from the interplay of diverse cortical and subcortical systems (e.g., [25]). The phonological loop may be the most isolable neural network of the system. Aboitiz et al. [26] have argued that the phonological loop is a specialized auditory-vocal sensorimotor circuit connecting posterior temporal areas with the inferior parietal lobe and the ventrolateral prefrontal cortex. What is clear, however, is that WM is a complex neural network consisting of neural pathways that interlink much of the neocortex. As such, adult phenotypes are likely to be the result of structural and regulatory genes governing neural development, and also individual developmental context.

Working memory is a trait that varies in modern populations, and the variability correlates with performance on several measures of intelligence, including language comprehension and planning. Much of this variability appears to be under strong genetic control. Coolidge et al. [27], in an analysis of child and adolescent twins as rated by parents, found that a core of executive functions, including planning, organizing, and goal attainment, was highly heritable (77%) and most likely due to an additive (polygenic) genetic influence. In a study specifically focused on general WM functions, Ando et al. [28] found a strong additive genetic influence (43–49%). And on phonological storage capacity, Rijsdijk et al. [29] found a 61% additive heritability. Friedman et al. [30] demonstrated that executive functions are correlated because they are controlled by a highly heritable (99%) common factor that could not be explained by simple intelligence or perceptual speed, and yet they can be separated because of other genetic influences that may be unique to particular executive functions. They concluded that the combination of general and specific genetic influences makes the executive functions among the most heritable psychological traits.
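
For readers unfamiliar with how such figures are derived, the sketch below (in Python) applies the classical Falconer estimate of additive heritability from twin correlations, h2 = 2 × (rMZ − rDZ). The correlations shown are hypothetical, chosen only for illustration, and the studies cited above fitted more elaborate biometric models rather than using this shortcut.

def falconer_h2(r_mz, r_dz):
    """Additive (narrow-sense) heritability from twin correlations: h2 = 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)

# Hypothetical example: identical twins correlate 0.72 on a WM measure, fraternal twins 0.42.
print(round(falconer_h2(r_mz=0.72, r_dz=0.42), 2))   # prints 0.6, i.e., roughly 60% additive heritability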

Today working memory is arguably the most researched and well-understood model in cognitive neuroscience. As of 2007, more than 15,000 articles had been published containing the term “working memory” [31]. This much research is bound to engender disagreements and controversy but is also likely to make significant progress. The general features of WM are now well delimited and understood; it is the details that drive most research. As such it is a powerful model for understanding the evolution of the modern mind.

4. Methodological Considerations

The challenge in cognitive archaeology is methodological. The archaeological record itself is impoverished compared to the experimental and ethological data sets of most cognitive science. However, archaeological data are the residue of activities that occurred in the past, and as such are the only direct evidence of past behavior and past minds. Tapping this data reservoir requires carefully constructed arguments that must be both cognitively valid and archeologically credible. In practice, a valid cognitive archaeological argument must have three components.

(1) The cognitive ability under investigation must be well defined by cognitive science. Common-sense categories such as “abstract” or “complex” are just too vague to allow selection of valid attributes that could be applied to archaeological remains. Unfortunately, it is just such common-sense categories that underpin most archaeological arguments for modern cognition. It is noteworthy how rarely archaeologists have taken the trouble to inform themselves about cognitive science. Exceptions to this naivety include the work of Mithen [12], who based his analysis on concepts in developmental psychology, Davidson and Noble [10, 14], who drew on Wittgenstein and the ecological psychology of James Gibson, Ambrose [32], who has used McDaniel and Einstein’s concept of prospective memory, and Henshilwood and Dubreuil [33], who have relied on neurosemiotic theory. Archaeologists must inform themselves about cognitive science if they wish to make substantive contributions to the study of the evolution of the human mind. The cognitive science literature is immense and diverse, and much like evolutionary science, there are many factions and schools of thought. One cannot simply dip into it and pull out a usable model. One must understand the intellectual context in which it developed and in which it is used. The payoff is well worth this effort—experimentally or ethologically justified descriptions of cognitive abilities.

(2) The archaeologist must identify activities that would require the cognitive ability under investigation. This is the key methodological step. Unfortunately, the cognitive literature itself rarely addresses the kinds of activities that archaeologists can document. Such activities tend to be messy (metaphorically and actually) and difficult to operationalize in the laboratory. Some cognitive scientists do try to incorporate real world activities in their discussion, but for the most part archaeologists must themselves identify the appropriate activities, based either on their own experimental protocols (see, e.g., [34] or [35]), or on their understanding of the cognitive ability in question (see [36] for an example using spatial cognition). It is here that the value of explicit cognitive models becomes apparent. Because the WM model identifies response inhibition as an important component of the central executive, we can ask the tractable question “What activities require response inhibition?”, and generate a list of activities that would be visible archaeologically. A strict standard of parsimony must apply; the activities must require the cognitive ability. If an activity (say, specialized mammal hunting) could be performed using a less powerful form of cognition (e.g., procedural memory), then the less powerful form must be given precedence.

(3) The archaeologist must define attributes of the activities that would preserve in the archaeological record and which can reliably stand for the activity. This is the essential archaeological piece of the argument, and is a step required in any archaeologically based reasoning. One of the major challenges in this step is equifinality. Often many activities can produce identical or very similar archaeological residues (e.g., hunting versus high-end scavenging leave similar butchery traces). Again a strict standard of parsimony must apply; one must be confident of the link between archaeological traces and the reconstructed activity. For example, some (e.g., [37]) argue that evidence for Neandertal burial in Middle Eastern sites is evidence for modern symbolic ability. However, the evidence is more parsimoniously explained as minimal corpse treatment by Neandertals with strong emotional attachment, and grief at the loss of the deceased.

Archaeological credibility is no different for cognitive archaeology than it is for any other archaeological interpretation. The evidence must have been acquired by sound field and analytical techniques, and it must be reliably situated in time and space. These requirements are easily stated but not easily met. Indeed, one could argue that the preponderance of time, energy, and resources in any archaeological research is devoted to these practical issues. But this does not in turn mean that archeological credibility is more important than cognitive validity in the structure of a cognitive interpretation. Both are equally necessary.

5. Archaeological Evidence for Modern Working Memory Capacity

We have already set out our case for the first component of our archeological argument for the evolution of modern cognition. Working memory is a well-defined, voluminously documented component of the modern mind. Moreover, it is an ability that varies in modern people, and it is an ability possessed by nonhuman primates at a comparatively reduced capacity. Modern WM capacity must have evolved over the course of human evolution. Hints at increasing capacity (beyond an ape range) can be identified as far back as Homo erectus [17]. But when did it achieve modern levels, something we have labeled “enhanced working memory” (EWM)?

The second step in the analysis is to identify activities that require not just WM, but EWM. This presents two related practical problems. First, WM capacity is typically measured in terms of numbers of discrete items (e.g., terminal words in a series of read sentences). We cannot apply such tests in prehistory, and thus a simple quantitative measure is unavailable. We must rely on behavioral correlates of WM capacity. Second, psychological tests of WM capacity rarely include activities that would leave an archaeological signature; it is necessary for the archaeologist to select the appropriate activities. In practice this requires that we, as researchers, make ordinal comparisons of everyday tasks. Because these judgments are ordinal (e.g., more versus less), they are by nature not fine-grained. For example, we will argue that planning months and years in advance is a feature of modern executive thinking enabled by modern WM. It is fairly easy to cite modern examples. But what would constitute archaic WM? We can argue that prehistoric groups who did not demonstrate appropriate activities did not have modern WM, but it is effectively impossible to assign a number. Moreover, there is always the danger of underestimating WM if we rely on only a few kinds of activities. We must therefore use a variety of different activities if we wish to have reasonable confidence in our assessment. Below we will identify technological activities, subsistence activities, and information processing activities that we suggest are reliable indicators of EWM.

The final step in the analysis is to scour the archaeological record for the earliest credible evidence for the activity in question. There are several inherent pitfalls in this step. One, the problem of equifinality, we have already touched upon. A second is simple serendipity. Much of the evidence we seek requires good preservation—a rarity in palaeolithic sites—but we also need the good fortune to find such sites. Archaeologists have little control over these factors. However, we need not kowtow to the dictum “absence of evidence is not evidence of absence.” Often absence of evidence is evidence of absence. Nevertheless, it is always dangerous to conclude that the evidence from any one site will remain the oldest, or to adhere to too strict a chronology. As a corollary, it is important to use as many different kinds of archaeological evidence as possible. If the archaeological evidence for many different activities all points to the same chronological conclusion, then confidence in the conclusion improves. Finally, some archaeological evidence is direct—archeologists find physical remains of the activity. But some is indirect; the archaeological remains strongly imply the presence of the activity. For example, archaeologists have occasionally found actual traps made of wood and fiber. As you might suppose, these are very rare because the constituent materials rarely survive the ravages of time. The oldest such examples are only about 8,000 years old. So, does the first use of traps date back only 8,000 years? Archaeologists think not, but the evidence is indirect, primarily in the form of animals that could not be effectively killed or captured without the use of traps. This indirect evidence pushes traps back to perhaps 70,000 years ago [38], a considerable difference.

6. Technical Evidence for Enhanced Working Memory

The irony for archaeologists is that technology is the most visible activity in the archaeological record, but one of the least likely to require the resources of EWM. Most tool making and tool use relies, often exclusively, on a style of thinking known as expertise or expert performance [39–41]. This kind of thinking relies on procedural cognition and long-term memory—motor action patterns learned over years of practice and/or apprenticeship. It is also largely nonverbal. Very little of the problem solving ability of EWM is ever devoted to tool use. Instead, flexibility in tool use comes from the large range of procedures and solutions learned over years. The millions of stone tools produced over human evolution tell us mostly about this other cognitive system, not WM. It is not that WM was never used, just that it is almost impossible to eliminate procedural cognition as a candidate for the cognition behind the tool or its use in question (e.g., equifinality and parsimony). Nevertheless, there are technical systems that do require EWM, and which cannot be reduced to procedural cognition. Most of the good examples (e.g., alloyed metals and kiln-fired ceramics) appeared so late in human evolution (ca. 6,000 years ago) as to engender little controversy or interest. There are just a few that extend much further back.

6.1. Traps and Snares

“Facility” is a term for relatively permanent immobile constructions built onto or into the landscape [42]. Perhaps the most common facilities used by hunters and gatherers are traps and snares, which are designed to capture or kill animals (including fish). Facilities, including traps and snares, are often multicomponent gadgets, occasionally very heavy, that are time-consuming to build, and which operate remotely, occasionally in the absence of direct human engagement. It is the remote action that implicates EFs and EWM. To make a trap one must project present action toward a future, uncertain result. This requires the long-range planning in space and time of modern EFs, and relies significantly on the response inhibition of the central executive of WM (delayed gratification).

Direct archaeological evidence for traps and snares, as mentioned above, has a relatively shallow antiquity. Actual wooden fish traps date back 4,500 years in North America, and a few thousand years earlier in Europe, that is, not much earlier than the alloyed metals passed over above. The oldest direct evidence of a kind of trap appears to be the “desert kites” of the Middle East [43]. These are lines of piled stone cairns, often hundreds of meters long, converging on a stone enclosure. They were used to hunt gazelle, and the oldest are about 12,000 years old.

Indirect evidence pushes traps and snares back to about 35,000 and perhaps even 75,000 years ago. At Niah Cave on Borneo, Barker et al. [44] have documented extensive remains of bush pig, an animal best hunted using nets or snares, by about 35,000 BP. Similarly, Wadley [38] has recently argued that extensive blue duiker remains at Sibudu are indirect evidence for the use of traps by 70,000-year-old Middle Stone Age people in South Africa. In sum, traps and snares supply direct evidence for modern WM back to 12,000 BP, and indirect evidence back to 75,000 BP.

6.2. Reliable Weapons

Twenty-five years ago, Peter Bleed introduced a distinction in technical systems that has important cognitive implications, that between “maintainable” and “reliable” weapons [45]. The former require comparatively less effort to produce but are easier to fix (“maintain”) when necessary, for example, when damaged through use. Most stone tools, even from recent time periods, qualify as maintainable. Reliable weapons, on the other hand, are designed to assure function, that is, to reduce as far as possible the chances for failure. As such they tend to be overdesigned, complex in the sense of having several interrelated parts, hard to maintain, and often heavy. They often require long periods of “down time” for their construction and maintenance, and are most often intended to be deployed over short time spans of heavy use. Bleed developed this distinction as a way to understand the difference between simple stone-tipped thrusting spears and the sophisticated projectile systems of North American Paleoindians, which included spear throwers, flexible aerodynamic shafts, replaceable foreshafts, and thin, fluted stone points. However, the distinction between maintainable and reliable applies generally to all technologies, not just weapons. The guiding principle behind reliable systems is that the investment of time and labor well in advance of need will maximize future success. More recently, Shea and Sisk [46] have taken a related but narrower focus and argued that the use of complex projectile weaponry (spear throwers and bows and arrows) is a good marker of modern technical prowess. “We use the term ‘complex projectile technology’ to refer to weapons systems that use energy stored exosomatically to propel relatively low mass projectiles at delivery speeds that are high enough to allow their user to inflict a lethal puncture wound on a target from a ‘safe’ distance” (p. 102). They consider this development significant enough to qualify as a derived feature of modern behavior. Reliable weapons, and in particular complex projectile weapons, rely on the executive function ability to plan over long stretches of time, and especially the response inhibition of WM (i.e., do not hunt today, even if you are hungry, but instead invest your effort in producing tools more likely to succeed tomorrow), and contingency planning (if the foreshaft breaks, slip in a new one; it is quicker than making an entire spear).

The archaeological record in North America clearly places reliable weapons back to Paleoindian times, at least 11,500 BP (roughly the same age as the earliest desert kites in the Near East, which were also reliable technical systems). Earlier examples rest on our ability to judge time investment and effectiveness of technical systems. Following Pike-Tay and Bricker [47], we believe that one earlier type of Palaeolithic artifact qualifies as being a component of a reliable system, and certainly an element of complex projectile technology—the bone and antler projectile points (a.k.a. sagaies) of the European Upper Palaeolithic. To make these artifacts, artisans used stone tools to remove appropriately sized blanks from a piece of bone or antler, often after soaking the raw material, and then carved the blanks into specific shapes (split based, barbed, etc.). Most were spear points hafted directly onto shafts, but others were harpoon heads, designed to come off the shaft while attached to a line. There are many examples of reworked points, attesting to the time required to make one from scratch. The most spectacular examples of such projectile points, which include the harpoons, date from the Late Upper Palaeolithic, about 14,000–18,000 BP, with slightly simpler systems extending back to 30,000 years ago. In Africa, bone points date back even earlier, perhaps as early as 90,000 years ago in the Congolese site of Katanda [46, 48]. The European evidence is more compelling because of the contemporary evidence for managed foraging (see below), and evidence for spear throwers and harpoons, which imply systems of gear. As yet the early African evidence consists of just the bone points, but it is provocative nonetheless.

6.3. Hafting

Hafting—attaching a stone tool to a shaft—has itself often been touted as a technological and even cognitive watershed in human evolution [13, 32]. Hafted tools represent the first time Palaeolithic people united separate elements into a single tool. These compound tools consist of three distinct elements: the stone tool (usually a spear point), the shaft, and the haft itself. It was the haft that was the challenge because it had to withstand significant impact forces when the tool was used. Spears with hafted stone points represent a clear escalation in the human-prey arms race, and it is fair to emphasize their importance in technological history. But their cognitive significance is harder to assess. Much hinges on how the hafting was done. A simple haft using a naturally available glue has different implications than a haft requiring days of soaking animal tendons followed by controlled, heated drying of the lashings on the shaft. The former is a straightforward, single-sitting task, while the latter is a multiday procedure. In a sense, the former leans toward the maintainable end of the maintainable-reliable continuum, the latter toward the reliable end. It is only the latter that carries clear implications for EWM capacity. Hafting also calls out for a discussion of invention, the conscious design of an innovative technology. Someone had to design the first haft; it could not have occurred by accident. And it would be very informative, from a cognitive perspective, to know just how that person came up with the idea. The frustrating answer is that we just do not know. We can speculate, but our speculations cannot then be used as data for a cognitive interpretation.

The earliest evidence for hafting extends back probably 200,000 years in Europe, the Middle East, and Africa [49–52], and includes examples by Neandertals and modern humans. Thus far, at least, these early hafts seem to be of the simpler, single-sitting task variety, though certainly collection of natural adhesives adds a component of complexity to the task, and Grünberg [53] and Koller et al. [54] have argued that the production of birch pitch required sophisticated knowledge of heating temperatures. It is only after 100,000 years ago that there is evidence for multiday hafting procedures. The best evidence comes from Sibudu in South Africa (the same site that yielded the indirect evidence for hunting blue duikers with snares) at about 70,000 BP. Here hunters used a mixture of acacia gum, a little beeswax, and powdered ochre to produce an adhesive that had to be carefully dried using fire [34]. Although in theory such hafting could be accomplished by procedural cognition, the variety of constituents required for the adhesives, and the multiday procedure itself, imply the use of modern WM, particularly response inhibition and contingency planning.

To summarize, three lines of technical evidence are in broad agreement. Convincing archaeological evidence extends easily back to 18,000 years BP or so, but there are strong examples going back as far as 70,000 years BP in Africa. Earlier than that there is only the single example of simple hafts, which cannot alone bear the weight of assigning modern WM. It is important to reiterate that technology is not a domain of activity that easily documents WM capacity. Procedural cognition can be effective and flexible, and can encompass almost all technical activity. Certainly hafting, or even complex projectile technology, could not alone stand as evidence for modern executive thinking (nor, we should emphasize, do Shea and Sisk [46] make such a claim). Of the examples we cite, the only one that might stand alone as an argument for modern cognition is the example of traps. Technical evidence works better when it supports or corroborates evidence from other domains.

7. Foraging Systems

Next to technology the domain of activity most visible in the archeological record is subsistence—acquiring and processing food. And like technology, archaeologists’ arguments for modern subsistence systems have been heavily distorted by the record of the European Upper Palaeolithic, especially its later phases, which included examples of specialized hunting of single species such as reindeer or mammoth. These were no doubt impressive subsistence systems but specialization per se does not actually require the planning resources of modern EFs and WM. It can easily be organized and executed by expert procedural cognition. In fact this is arguably a more appropriate cognitive strategy because it consists of well-learned, automatic responses that can be selected and deployed quickly in dangerous situations. Neandertals were very good at this kind of thinking and, no surprise, we have extensive evidence for specialized hunting [55, 56]. Thus, it is necessary to eschew this war-horse of modernity and identify subsistence activities that actually do require modern EFs and WM.

Modern people manage their food supply. This is obvious in agricultural economies, where activities must be planned on a yearly scale (for nontropical systems). It clearly relies on the long-range planning of EFs and, more specifically, the response inhibition that is a key component of modern WM (e.g., retaining a portion of the harvest for replanting even in the presence of extreme want). But agriculture is not the only form of managed foraging. Most of the hunting and gathering systems archaeologists have recognized as “complex” also qualify [57]. Good recent ethnographic examples include foragers of the Northwest Coast of America, the Arctic, and Australia. In Northern and Western Australia, hunter-gatherers systematically burn tracts of land in order to encourage a second green-up of grass, which attracts herbivores. They rotate the tract to be burned every year, and do not return to a tract for at least a decade [58]. This is a managed system, with planning over long periods, and response inhibition. Another component of modern hunting and gathering systems is a marked division of labor by age and sex [59]. It requires coordination of separate labor pools, which weakly implicates WM and its executive functions (organization, delegation, disputation, etc.), but more importantly is manifested in the tropics by increased reliance on small, seasonal resources (plants and small animals) that require scheduled harvesting, typically by women and children.

Archaeological evidence for agriculture extends back to 10,000 years BP on several continents, and evidence for managed forms of hunting and gathering back another several thousand years in the guise of Archaic, Mesolithic, and Epipalaeolithic cultures all over the world. An especially good example is that of the Epipalaeolithic site of Abu Hureyra in Syria [43]. Here a group of hunters and gatherers established a sedentary community based on hunting gazelle and gathering a wide variety of local plants. When the local conditions became much drier 11,000–10,000 years ago these people did not simply shift the focus of their hunting and gathering; they changed its very basis by beginning to cultivate rye. The interesting point is not so much the broad spectrum hunting and gathering but the inventive response to changing conditions. These people were clearly using the planning abilities enabled by EWM.

Finding evidence for managed foraging that is earlier than the end of the Pleistocene is fraught with problems, mostly linked to preservation, but also to mobility patterns of earlier hunter-gatherers who rarely settled in permanent sites like Abu Hureyra. The amount of refuse is much less, and harder to characterize. Nevertheless, there are several provocative earlier examples. A well-known example is that of late Upper Palaeolithic reindeer hunters [60] of southwestern Europe (ca. 18,000 years BP). Here it is not the specialization that is telling (see above), but the evidence for a tightly scheduled hunting system in which herds were intercepted and slaughtered at specific locations during migrations, but at other times of the year were hunted individually using a different set of tactics. Though other resources were used, reindeer were the clear focus year-round, and the hunters pursued them with a seasonally adjusted strategy that included periods of down-time during which they made and maintained their complex technical gear (see above). At about the same time, hunters on the Russian Plain used a system in which they killed large numbers of animals during late summer/early fall and then cached large quantities of meat in underground storage pits for freezing and future consumption [61]. Storage and delayed consumption are strong evidence for modern WM.

Earlier evidence is largely indirect. At Niah Cave on the island of Borneo [44], archaeologists have recovered large quantities of pollen from plants that flourish on recently burned areas. The local tropical conditions are quite wet, and the pollen far exceeds what one would normally expect to find, suggesting extensive human-induced burning. This evidence dates to sometime between 42,000 and 28,000 years BP. Earlier still is the evidence for hunting blue duiker in South Africa using snares or traps (70,000 years BP, see above). Of similar antiquity is evidence from other South African sites for extensive use of corms (fleshy, semisubterranean stems), which are features of plants that flourish on burned landscapes, suggesting, as at Niah, human use of fire as an ecology-altering tool [62]. Kuhn and Stiner [59] argue that this broadening of the subsistence base in South Africa is an indication of division of labor by age and sex. In sum, the archaeological evidence for managed foraging parallels the evidence of technology. There is a strong signature going back 18,000 years or so, and a weaker, but still provocative, set of isolated examples going back 70,000 years.

8. Information Processing

Thus far in our discussion we have focused primarily on the long-range planning and response inhibition components of modern EFs and EWM and have traced them archaeologically through the technological and subsistence records. We now shift focus to problem solving, another of the executive functions enhanced through an increase in WM capacity. Working memory is the active problem solving “space” of the modern mind. We use WM to construct analogies, perform thought experiments, make contingency plans, and even make metaphors. It is how and where we bring things together in thought; however, even modern WM has a limited capacity, because the episodic buffer is a limited-capacity store. If the capacity of this store is depleted by holding raw information, little comparison and processing can occur (try multiplying two four-digit numbers in your head). One solution to the problem that modern humans regularly use is externalization of some of this information, that is, holding the information outside of the mind itself. This is an aspect of extended cognition, which has recently received significant attention in cognitive science [63], and even archaeology [64, 65]. Our interest is in the implications that extended cognition holds for WM, and the primary effect is to extend WM capacity by relieving the necessity to hold information in the episodic buffer, thereby freeing capacity for the processing components of the central executive. Examples of such externalized storage systems abound in the modern world—writing, numbers, calculators, and so on. External systems need not be artifactual—one can, for example, count on one’s fingers—but they often are artifactual, which gives us an avenue to follow into the prehistoric past.
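
A homely worked example (our own, with arbitrary numbers) illustrates the point. Multiplying 4,237 by 1,586 mentally requires holding several partial products and a running sum in attention at once; writing the partial products down externalizes them, so that only the single step currently being performed occupies working memory:

4,237 × 6     =    25,422
4,237 × 80    =   338,960
4,237 × 500   = 2,118,500
4,237 × 1,000 = 4,237,000
total         = 6,719,882

The paper, not the mind, holds the intermediate results; the episodic buffer is left free for the processing itself.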

It is uncontroversial to assert that early writing and accounting systems, which date back at least 5,000 years, were external information storage. Systems of clay tokens, used for accounting purposes, extend the record back several thousand years into the early Neolithic [65, 66]. We pick up the trail about 12,000 years ago at the site of Grotte du Tai in western France [67].

The plaque in Figure 1 above appears to have been a record keeping device. Someone engraved a series of long lines crossed by groups of slashes on a piece of flat bone. Marshack, and later d’Errico [68], examined the markings microscopically and determined that they were produced by different tools, probably in different episodes. Marshack famously argued that it was a lunar calendar, but d’Errico concluded simply that it was an external memory device. We do not know what the engraver was tracking, but it was clearly something, attesting to a desire to externalize information, thereby freeing up WM capacity for processing. Similar objects, not quite as elaborate, date back in Europe to about 28,000 years BP [2].

Earlier still, and equally provocative, are the therianthropic figurines from Germany, the most famous of which is the Hohlenstein-Stadel figurine (see Figure 2). This is an image of a lion-headed person (or human-bodied lion) carved in elephant ivory, roughly 28 cm high. It is about 32,000 years old [69]. It is certainly an evocative piece, and has inspired much discussion about symbolism and Upper Palaeolithic religious thinking. It also has a number of important implications for cognition [70], one of which concerns WM. The figurine is an externalized abstraction. Such a creature does not exist in the real world, and it must have been metaphorically glued together, initially at least, in the WM of some Upper Palaeolithic person. The problems people need to solve are not always practical issues in day-to-day life. They are also social, and even metaphysical. The Hohlenstein-Stadel figurine is the externalization of such a metaphysical problem, and its externalized presence frees up the WM of the artisan, and also of observers, to ponder other related existential issues.

So far, our discussion of externalized information processing has not yielded any surprises. Suggesting that Upper Palaeolithic people in Europe 32,000 years ago exercised modern cognition is neither a novel nor a controversial conclusion. But what about earlier? Can we push externalized information back as far as the early evidence of traps, or managed foraging? The answer is yes, but it requires a slightly different take on a famous set of artifacts—the Blombos beads.

Blombos Cave is a site on the coast of South Africa whose Middle Stone Age levels date back at least 77,000 years. These MSA levels have famously yielded engraved bones, shaped and engraved pieces of ochre, bone awls, and marine shell beads [3, 33, 71]. These are among the earliest putatively modern artifacts yet found, and make a strong case for extending many of the components of modern behavior and cognition back to this early period. But the initial enthusiasm has recently been tempered by more sober critiques. d’Errico and Henshilwood [71], for example, argued that the presence of decorative beads indicates that the inhabitants had fully syntactical language. This conclusion was then elegantly challenged by Botha [72], who pointed out that d’Errico and Henshilwood had not made explicit and convincing bridging arguments linking beads to language. Henshilwood and Dubreuil [33] have replied, providing part of the linkage (a very nice example of a productive scholarly exchange), but the implications of the beads are still not entirely clear. We suggest that an alternative approach is to look at these artifacts not in their possible symbolic role, but as externalized information storage. Henshilwood and Botha agree that the shells with punched holes were beads, and that the beads were worn as ornaments. But why does one wear beads? One answer is that one wears beads to send information about oneself to another person. This could be an explicit message about social status (“I am an adult”, “I am wealthy”, and so on), or an implicit message (“I am a good mate prospect”), but by changing how others view the wearer, the wearer is externalizing information about him or herself. Curiously, there is an alternative function for these beads that neither Henshilwood nor Botha have considered. They might have been tally devices, used to keep track of (remember) some sequential phenomena (much like rosary beads). The social implications of this option are different from the decorative bead interpretation, but the information implications are similar: beads were an externalized store of information, freeing WM to devote space in the episodic buffer for processing information, rather than just holding it in attention.

This evidence of externalized information storage is provocative. We live in a modern world where externalized information has come to dominate, perhaps even overwhelm, our daily lives, and the thought that it had its roots far back in the Stone Age is certainly striking. However, for our topic at hand—working memory—external storage of information is actually an ambiguous signature. Externalization of information would release storage space for episodic buffers of whatever capacity, not just modern capacity. There need only be enough capacity to hold the external device as a token of some kind in the episodic buffer as one performs processes upon it. Because of this ambiguity, we suggest that evidence for the use of external storage devices cannot, on its own, provide compelling evidence for modern WM. It can, however, stand as corroborating evidence for assessments established by other means. As such, it supports the picture painted using technology and subsistence, that is, strong evidence in this case going back to perhaps 30,000 years, and weaker evidence extending further to 77,000 BP.

9. Conclusion and Discussion

At the outset of this essay we noted that archaeologists who study the problem of modern human origins typically address two rather different subgoals—the emergence of modern culture/cognition, and/or the specific evolutionary scenario by which one or more archaic populations made the transition to modernity. Our documentation of the final enhancement of working memory is primarily a contribution to the first of these subgoals. Evidence from neuroscience clearly identifies a planning and problem solving ability that is isolable neurologically and behaviorally from symbolic and language abilities. This component of modern thinking is working memory. Archaeological evidence indicates that human WM capacity underwent an enhancement to the modern range in the relatively recent past. Given the serendipitous nature of archaeological preservation and recovery, assigning a precise date for this development is not yet possible. Modern WM capacity was certainly in place by 30,000 years ago, but there is scattered evidence that it may be as old as 77,000 years ago. Despite this range of dates, it is clear that an enhancement of WM capacity was one of the final developments in human cognition that produced the modern mind.

Perhaps not unexpectedly, most of the criticism of our hypothesis has been aimed at its implications for the second subgoal pursued by archaeologists—the narrative scenario of just who became modern [73, 74]. We admit to fueling this fire by directly addressing the issue of Neandertal cognition [41, 75, 76]. The archaeological signature of Neandertals is well known, but does not provide evidence for the enhancement of WM that can be found elsewhere. But beyond this fairly direct contrast, the archaeological record for EWM fits several alternative scenarios for just who became modern and when they did.

(1) Alleles for enhanced WM accompanied the parietal hypertrophy that distinguishes the brains of Homo sapiens sapiens from those of archaic Homo sapiens such as Neandertals [77]. The parietal hypertrophy is not itself evidence for an increase in WM capacity because WM is primarily a frontal lobe function, with significant neural links to the parietal and temporal lobes, and the basal ganglia. But clearly something did evolve in the brains of Homo sapiens sapiens by about 200,000 years ago [50], and perhaps WM capacity accompanied this development. If true, it leaves us with a chronological gap. Archaeological evidence for enhanced WM does not appear until at least 130,000 years later. There are two ways to account for this lacuna.

(a) The alleles for enhanced WM at first yielded only a very modest reproductive advantage, and it was not until a significant proportion of the population (presumably African) expressed the enhanced phenotype that group planning and problem solving began to provide a more marked advantage, which powered a subsequent rapid expansion after 70,000 years ago. We do not know whether or not such a sequence could even be modeled in microevolutionary terms, and at this point it remains conjecture.

(b) The alleles for enhanced WM yielded an immediate phenotypic advantage, which resulted in modern problem solving ability, but because its expression played out through learned cultural mechanisms, the ratchet effect initially allowed for only slow, almost imperceptible change. Essentially, enhanced WM had little to work with until cultural knowledge had accrued more and more components. If the archaeological record were more complete, we would see an accelerating rate of culture change over the course of 150,000 years. Some would argue that this is precisely what we do see, but given the limited number of data points such a conclusion is unwarranted, at least for now.

We find the 100,000–150,000-year gap between the first anatomically modern humans and the evidence for EWM troubling, and are unconvinced, thus far, by either solution (a) or (b). More likely, in our view, alleles for enhanced WM occurred by mutation in an anatomically modern African population after 200,000 years ago, and probably after 100,000 years ago. Here the serendipity of archaeological discovery clouds the chronology. Some evidence—for example, the indirect evidence for use of snares at Sibudu [38]—is as old as 75,000 BP. But abundant evidence for enhanced WM did not appear until about 30,000 years ago. As such, the archaeological record does fit an “Out of Africa” scenario in which a local southern or eastern population of modern humans expanded rapidly out of Africa following the demographic crash that occurred about 70,000 years ago [6]. It may well turn out, when the archaeological record for southern and eastern Africa is more thoroughly documented, that this version will prevail. But this is not the only feasible scenario. It is also possible that the final enhancement of WM occurred closer to 30,000 years BP, and spread rapidly via gene flow to populations all over the world. The archaeological evidence for enhanced WM just does not have the resolution to settle this specific evolutionary puzzle.

A final observation concerning human cognitive evolution in general is appropriate. Evidence for the evolution of working memory fits nicely into a mosaic account of human cognitive evolution. Some components of modern cognition evolved long ago. Spatial cognition, for example, was modern by 500,000 years ago [78, 79], and evolved in circumstances very different from those of the last 100,000 years. Procedural cognition also has considerable antiquity, with archaic humans such as Homo heidelbergensis and Neandertals demonstrating modern procedural abilities [41, 80]. Symbolism, though a poorly defined cognitive ability, has roots stretching back perhaps 300,000 years (pigment use at Twin Rivers in Zambia [81]). In a very real sense, the search for the evolution of modern cognition is a fool’s game. The components of modern cognition, like the components of modern anatomy, evolved at different times for different reasons. True, the final package did not come together until after 100,000 years ago, with working memory perhaps the final piece. But this was preceded by many other developments equally important to the modern mind.