Working memory

Working memory is a theoretical framework within cognitive psychology that refers to the structures and processes used for temporarily storing and manipulating information. There are numerous theories as to both the theoretical structure of working memory (see the "organizational map" that follows) as well as to the specific parts of the brain responsible for working memory. However, it is of the accepted view that the frontal cortex, parietal cortex, anterior cingulate, and parts of the basal ganglia are crucial for functioning. Much of the understanding of the neural basis of working memory has come from lesion experiments in animals and imaging experiments in humans.

Today there are hundreds of research laboratories around the world studying various aspects of working memory. There are numerous applications of working memory in the field, such as using working memory capacity to explain intelligence and other cognitive abilities, furthering the understanding of autism and ADHD, improving teaching methods, and creating artificial intelligence based on the human brain.

History
The term was first used in the 1960s in the context of theories that likened the mind to a computer. Before then, what we now call working memory was referred to as short-term memory, sometimes also as primary memory, immediate memory, operant memory, or provisional memory. Short-term memory is the ability to remember information over a brief period of time (in the order of seconds). Most theorists today use the concept of working memory to replace or include the older concept of short-term memory, thereby marking a stronger emphasis on the notion of manipulation of information instead of passive maintenance.

The earliest mention of experiments on the neural basis of working memory can be traced back to over 100 years ago, when Hitzigand Ferrier described ablation experiments of the prefrontal cortex (PFC). They concluded that the frontal cortex was important for cognitive processes rather than sensory ones. In 1935 and 1936, Jacobsen and colleagues were the first to conclude that the cognitive processes in the PFC were notable in delay-dependent tasks; in other words, they suffered from short-term memory loss.

Theories of Working Memory
There have been numerous models proposed regarding how working memory functions, both anatomically and cognitively. Of those, three have received the distinct notice of wide acceptance.

The Baddeley and Hitch model
Baddeley and Hitch (1974) introduced and made popular the multicomponent model of working memory. This theory proposes that two "slave systems" are responsible for short-term maintenance of information, and a "central executive" is responsible for the supervision of information integration and for coordinating the slave systems. One slave system, the phonological loop, stores phonological information (i.e., the sound of language) and prevents its decay by continuously articulating its contents, thereby refreshing the information in a rehearsal loop. It can, for example, maintain a seven-digit telephone number for as long as one repeats the number to oneself again and again. The other slave system, the visuo-spatial sketch pad, stores visual and spatial information. It can be used, for example, for constructing and manipulating visual images, and for the representation of mental maps. The sketch pad can be further broken down into a visual subsystem (dealing with, for instance, shape, colour, and texture), and a spatial subsystem (dealing with location). The central executive (see executive system) is, among other things, responsible for directing attention to relevant information, suppressing irrelevant information and inappropriate actions, and for coordinating cognitive processes when more than one task must be done at the same time. Baddeley (2000) extended the model by adding a fourth component, the episodic buffer, which holds representations that integrate phonological, visual, and spatial information, and possibly information not covered by the slave systems (e.g., semantic information, musical information). The component is episodic because it is assumed to bind information into a unitary episodic representation. The episodic buffer resembles Tulving's concept of episodic memory, but it differs in that the episodic buffer is a temporary store.

The Theory of Cowan
Cowan regards working memory not as a separate system, but as a part of long-term memory. Representations in working memory are a subset of the representations in long-term memory. Working memory is organized in two embedded levels. The first level consists of long-term memory representations that are activated. There can be many of these, there is no limit to activation of representations in long-term memory. The second level is called the focus of attention. The focus is regarded as capacity limited and holds up to four of the activated representations. Oberauer has extended the Cowan model by adding a third component, a more narrow focus of attention that holds only one chunk at a time. The one-element focus is embedded in the four-element focus and serves to select a single chunk for processing. For example, you can hold four digits in mind at the same time in Cowan's "focus of attention". Now imagine that you wish to perform some process on each of these digits, for example, adding the number two to each digit. Separate processing is required for each digit, as most individuals can not perform several mathematical processes in parallel. Oberauer's attentional component selects one of the digits for processing, and then shifts the attentional focus to the next digit, continuing until all of the digits have been processed.

The Theory of Ericsson and Kintsch
Whereas most adults can repeat about seven digits in correct order, some individuals have shown impressive enlargements of their digit span - up to 80 digits. This feat is possible by extensive training on an encoding strategy by which the digits in a list are grouped (usually in groups of three to five) and these groups are encoded as a single unit (a chunk). To do so one must be able to recognize the groups as some known string of digit. One person studied by K. Anders Ericsson and his colleagues, for example, used his extensive knowledge of racing times from the history of sports. Several such chunks can then be combined into a higher-order chunk, thereby forming a hierarchy of chunks. In this way, only a small number of chunks at the highest level of the hierarchy must be retained in working memory. At retrieval, the chunks are unpacked again. That is, the chunks in working memory act as retrieval cues that point to the digits that they contain. It is important to note that practicing memory skills such as these do not expand working memory capacity proper. This can be shown by using different materials - the person who could recall 80 digits was not exceptional when it came to recall words. Ericsson and Kintsch (1995) have argued that we use skilled memory in most everyday tasks. Tasks such as reading, for instance, require to maintain in memory much more than seven chunks - with a capacity of only seven chunks our working memory would be full after a few sentences, and we would never be able to understand the complex relations between thoughts expressed in a novel or a scientific text. We accomplish this by storing most of what we read in long-term memory, linking them together through retrieval structures. We need to hold only a few concepts in working memory, which serve as cues to retrieve everything associated to them from by the retrieval structures. Anders Ericsson and Walter Kintsch refer to this set of processes as "long-term working memory".

Working memory capacity
Working memory is generally considered to have limited capacity. The earliest quantification of the capacity limit associated with short-term memory was the "magical number seven" introduced by Miller (1956). He noticed that the memory span of young adults was around seven elements, called chunks, regardless whether the elements were digits, letters, words, or other units. Later research revealed that span does depend on the category of chunks used (e.g., span is around seven for digits, around six for letters, and around 5 for words), and even on features of the chunks within a category. For instance, span is lower for long words than for short words. In general, memory span for verbal contents (digits, letters, words, etc.) strongly depends on the time it takes to speak the contents aloud, and on the lexical status of the contents (i.e., whether the contents are words known to the person or not). Several other factors also affect a person's measured span, and therefore it is difficult to pin down the capacity of short-term or working memory to a number of chunks. Nonetheless, Cowan (2001) has proposed that working memory has a capacity of about four chunks in young adults (and less in children and old adults).

Measures of working-memory capacity and their correlates
Working memory capacity can be tested by a variety of tasks. A commonly used measure is a dual-task paradigm combining a memory span measure with a concurrent processing task. For example, (Daneman & Carpenter, 1980) used "reading span". Subjects read a number of sentences (usually between 2 and 6) and try to remember the last word of each sentence. At the end of the list of sentences, they repeat back the words in their correct order. Other tasks that don't have this dual-task nature have also been shown to be good measures of working memory capacity. The question of what features a task must have to qualify as a good measure of working memory capacity is a topic of ongoing research.

Measures of working-memory capacity are strongly related to performance in other complex cognitive tasks such as reading comprehension, problem solving, and with any measures of the intelligence quotient. Some researchers have argued that working memory capacity reflects the efficiency of executive functions, most notably the ability to maintain a few task-relevant representations in the face of distracting irrelevant information. The tasks seem to reflect individual differences in ability to focus and maintain attention, particularly when other events are serving to capture attention. These effects seem to be a function of frontal brain areas.

Others have argued that the capacity of working memory is better characterized as the ability to mentally form relations between elements, or to grasp relations in given information. This idea has been advanced, among others, by Graeme Halford, who illustrated it by our limited ability to understand statistical interactions between variables. These authors asked people to compare written statements about the relations between several variables to graphs illustrating the same or a different relation, for example "If the cake is from France then it has more sugar if it is made with chocolate than if it is made with cream but if the cake is from Italy then it has more sugar if it is made with cream than if it is made of chocolate". This statement describes a relation between three variables (country, ingredient, and amount of sugar), which is the maximum most of us can understand. The capacity limit apparent here is obviously not a memory limit - all relevant information can be seen continuously - but a limit on how many relationships we can discern simultaneously.

It has been suggested that working memory capacity can be measured as the capacity C of short-term memory (measured in bits of information), defined as the product of the individual mental speed Ck of information processing (in bit/s) ( see the external link below to the paper by Lehrl and Fischer (1990) ), and the duration time D (in s) of information in working memory, meaning the duration of memory span. Hence:


 * C (bit) = Ck(bit/s) &times; D (s).

Lehrl and Fischer measured speed by reading rate. They claimed that C is closely related to general intelligence. Roberts, Pallier, and Stankov have shown, however, that C measures little more than reading speed. Moreover, the idea that working memory capacity can be measured in terms of bits has long been discredited by the work of Miller (1956), who demonstrated that working memory capacity depends on the number of chunks, not the number of bits (chunks can vary dramatically in how many bits they carry: a sequence like "1 0 0 1 0 1 1" consists of seven chunks worth seven bits - less than a single word, which is just one chunk).

Experimental studies of working memory capacity
Why is working memory capacity limited at all? If we knew the answer to this question, we would understand much better why our cognitive abilities are as limited as they are. There are several hypotheses about the nature of the capacity limit. One is that there is a limited pool of cognitive resources needed to keep representations active and thereby available for processing, and for carrying out processes. Another hypothesis is that memory traces in working memory decay within a few seconds, unless refreshed through rehearsal, and because the speed of rehearsal is limited, we can maintain only a limited amount of information. Yet another idea is that representations held in working memory capacity interfere with each other. There are several forms of interference discussed by theorists. One of the oldest ideas is that new items simply replace older ones in working memory. Another form of interference is retrieval competition. For example, when the task is to remember a list of 7 words in their order, we need to start recall with the first word. While trying to retrieve the first word, the second word, which is represented in close proximity, is accidentally retrieved as well, and the two compete for being recalled. Errors in serial recall tasks are often confusions of neighboring items on a memory list (so-called transpositions), showing that retrieval competition plays a role in limiting our ability to recall lists in order, and probably also in other working memory tasks. A third form of interference assumed by some authors is feature overwriting. The idea is that each word, digit, or other item in working memory is represented as a bundle of features, and when two items share some features, one of them steals the features from the other. The more items are held in working memory, and the more their features overlap, the more each of them will be degraded by the loss of some features.

None of these hypotheses can explain the experimental data entirely. The resource hypothesis, for example, was meant to explain the trade-off between maintenance and processing: The more information must be maintained in working memory, the slower and more error prone concurrent processes become, and with a higher demand on concurrent processing memory suffers. This trade-off has been investigated by tasks like the reading-span task described above. It has been found that the amount of trade-off depends on the similarity of the information to be remembered and the information to be processed. For example, remembering numbers while processing spatial information, or remembering spatial information while processing numbers, impair each other much less than when material of the same kind must be remembered and processed. Also, remembering words and processing digits, or remembering digits and processing words, is easier than remembering and processing materials of the same category. These findings are also difficult to explain for the decay hypothesis, because decay of memory representations should depend only on how long the processing task delays rehearsal or recall, not on the content of the processing task. A further problem for the decay hypothesis comes from experiments in which the recall of a list of letters was delayed, either by instructing participants to recall at a slower pace, or by instructing them to say an irrelevant word once or three times in between recall of each letter. Delaying recall had virtually no effect on recall accuracy. The interference hypothesis seems to fare best with explaining why the similarity between memory contents and the contents of concurrent processing tasks affects how much they impair each other. More similar materials are more likely to be confused, leading to retrieval competition, and they have more overlapping features, leading to more feature overwriting. One experiment that directly manipulated the amount of overlap of phonological features between words to be remembered and other words to be processed. Those to-be-remembered words that had a high degree of overlap with the processed words were recalled worse, lending some support to the idea of interference through feature overwriting.

The theory most successful so far in explaining experimental data on the interaction of maintenance and processing in working memory is the "time-based resource sharing model". This theory assumes that representations in working memory decay unless they are refreshed. Refreshing them requires an attentional mechanism that is also needed for any concurrent processing task. When there are small time intervals in which the processing task does not require attention, this time can be used to refresh memory traces. The theory therefore predicts that the amount of forgetting depends on the temporal density of attentional demands of the processing task - this density is called "cognitive load". The cognitive load depends on two variables, the rate at which the processing task requires individual steps to be carried out, and the duration of each step. For example, if the processing task consists of adding digits, then having to add another digit every half second places a higher cognitive load on the system than having to add another digit every two seconds. Adding larger digits takes more time than adding smaller digits, and therefore cognitive load is higher when larger digits must be added. In a series of experiments, Barrouillet and colleagues have shown that memory for lists of letters depends on cognitive load, but not on the number of processing steps (a finding that is difficult to explain by an interference hypothesis) and not on the total time of processing (a finding difficult to explain by a simple decay hypothesis). One difficulty for the time-based resource-sharing model, however, is that the similarity between memory materials and materials processed also affects memory accuracy.

Training of working memory
Recent studies suggest that working memory can be improved by training in ADHD patients. This study has found that a period of working memory training increases a range of cognitive abilities and increases IQ test scores. Consequently, this study supports previous findings suggesting that working memory underlies general intelligence. Improving or augmenting the brain's working memory ability may prove to be a reliable method for increasing a person's IQ scores. However, a recent theory of ADHD states that ADHD can lead to deficits in working memory.

Another study of the same group has shown that, after training, measured brain activity related to working memory increased in the prefrontal cortex, an area that many researchers have associated with working memory functions.

Localizing working memory functions
The first insights into the neuronal basis of working memory came from animal research. Fuster recorded the electrical activity of neurons in the prefrontal cortex (PFC) of monkeys while they were doing a delayed matching task. In that task, the monkey sees how the experimenter places a bit of food under one of two identical looking cups. A shutter is then lowered for a variable delay period, screening off the cups from the monkey’s view. After the delay, the shutter opens and the monkey is allowed to retrieve the food from under the cups. Successful retrieval in the first attempt – something the animal can achieve after some training on the task – requires holding the location of the food in memory over the delay period. Fuster found neurons in the PFC that fired mostly during the delay period, suggesting that they were involved in representing the food location while it was invisible. Later research has shown similar delay-active neurons also in the posterior parietal cortex, the thalamus, the caudate, and the globus pallidus.

Localization of brain functions in humans has become much easier with the advent of brain imaging methods (PET and fMRI). This research has confirmed that areas in the PFC are involved in working memory functions. During the 1990s much debate has centered on the different functions of the ventrolateral (i.e., lower areas) and the dorsolateral (higher areas) of the PFC. One view was that the dorsolateral areas are responsible for spatial working memory and the ventrolateral areas for non-spatial working memory. Another view proposed a functional distinction, arguing that ventrolateral areas are mostly involved in pure maintenance of information, whereas dorsolateral areas are more involved in tasks requiring some processing of the memorized material. The debate is not entirely resolved but most of the evidence supports the functional distinction.

Brain imaging has also revealed that working memory functions are by far not limited to the PFC. A review of numerous studies shows areas of activation during working memory tasks scattered over a large part of the cortex. There is a tendency for spatial tasks to recruit more right-hemisphere areas, and for verbal and object working memory to recruit more left-hemisphere areas. The activation during verbal working memory tasks can be broken down into one component reflecting maintenance, in the left posterior parietal cortex, and a component reflecting subvocal rehearsal, in the left frontal cortex (Broca’s area, known to be involved in speech production) ).

There is an emerging consensus that most working memory tasks recruit a network of PFC and parietal areas. One study has shown that during a working memory task the connectivity between these areas increases. Other studies have demonstrated that these areas are necessary for working memory, and not just accidentally activated during working memory tasks, by temporarily blocking them through transcranial magnetic stimulation (TMS), thereby producing an impairment in task performance.

A current debate concerns the function of these brain areas. The PFC has been found to be active in a variety of tasks that require executive functions. This has led some researchers to argue that the role of PFC in working memory is in controlling attention, selecting strategies, and manipulating information in working memory, but not in maintenance of information. The maintenance function is attributed to more posterior areas of the brain, including the parietal cortex. Other authors interpret the activity in parietal cortex as reflecting executive functions, because the same area is also activated in other tasks requiring executive attention but no memory

Most brain imaging studies of working memory have used recognition tasks such as delayed recognition of one or several stimuli, or the n-back task, in which each new stimulus in a long series must be compared to the one presented n steps back in the series. The advantage of recognition tasks is that they require minimal movement (just pressing one of two keys), making fixation of the head in the scanner easier. Experimental research and research on individual differences in working memory, however, has used largely recall tasks (e.g., the reading span task, see above). It is not clear to what degree recognition and recall tasks reflect the same processes and the same capacity limitations.

A few brain imaging studies have been conducted with the reading span task or related tasks. Increased activation during these tasks was found in the PFC and, in several studies, also in the anterior cingulate cortex (ACC). People performing better on the task showed larger increase of activation in these areas, and their activation was correlated more over time, suggesting that their neural activity in these two areas was better coordinated, possibly due to stronger connectivity.

How does the brain maintain memories over the short term?
Much has been learned over the last two decades on where in the brain working memory functions are carried out. Much less is known on how the brain accomplishes short-term maintenance and goal-directed manipulation of information. The persistent firing of certain neurons in the delay period of working memory tasks shows that the brain has a mechanism of keeping representations active without external input.

Keeping representations active, however, is not enough if the task demands maintaining more than one chunk of information. In addition, the components and features of each chunk must be bound together to prevent them from being mixed up. For example, if a red triangle and a green square must be remembered at the same time, one must make sure that “red” is bound to “triangle” and “green” is bound to “square”. One way of establishing such bindings is by having the neurons that represent features of the same chunk fire in synchrony, and those that represent features belonging to different chunks fire out of sync. In the example, neurons representing redness would fire in synchrony with neurons representing the triangular shape, but out of sync with those representing the square shape. So far, there is no direct evidence that working memory uses this binding mechanism, and other mechanism have been proposed as well. It has been speculated that synchronous firing of neurons involved in working memory oscillate with frequencies in the theta band (4 to 8 Hz). Indeed, the power of theta frequency in the EEG increases with working memory load, and oscillations in the theta band measured over different parts of the skull become more coordinated when the person tries to remember the binding between two components of information.