
Talking Language: On the evolutionary origins and computations of language


Academic year: 2021


On the evolutionary origins and computations of language

Author: Braunsdorf, Marius (marius.braunsdorf@student.uva.nl), Universiteit van Amsterdam

1st Supervisor: Dr. van Elk, Michiel (m.vanelk@uva.nl), Universiteit van Amsterdam

2nd Supervisor: Dr. Ploeger, Annemie (A.Ploeger@uva.nl), Universiteit van Amsterdam


Abstract

Language is one of the most distinctive properties of humans. No other animal possesses such an elaborate system for communicating thoughts. No wonder, then, that the faculty of language has been studied extensively in the cognitive sciences and neurosciences. Within those fields, different paradigms have developed to approach human language use. The current work assesses the evolutionary history of language and the features rendering it special. Following an introduction to language, I introduce two paradigms, massive modularity and predictive coding, along with their respective contributions to understanding language. The two frameworks are very different approaches to understanding the human mind. While massive modularity aims at explaining how and why certain behaviours are special and how they developed, predictive processing gained popularity through its intriguing approach of explaining the human mind holistically with a unifying computational principle. This review closes with a careful examination of the differences between the two frameworks with respect to the questions raised and answered, along with the critiques the paradigms have to face. Furthermore, other theoretical paradigms are juxtaposed and integrated into cutting-edge knowledge about the brain, with respect to the explanatory power they offer regarding how the brain works.


1 The Evolution of Language

1.1 What is Language Good for?

1.2 Why did Language Evolve?

1.3 How did Language Evolve?

2 The Faculty of Language

2.1 The Sensory-motor System

2.2 The Conceptual-intentional Interface

2.3 The Computational Hub

2.3.1 Recursion

2.3.2 Structural vs. Linear Distance

2.3.3 Merge

2.3.4 Displacement

3 Modularity of Mind

3.1 Evolutionary Psychology

3.2 History of the Modularity of Mind

3.3 The Massive Modularity Hypothesis

3.4 Modularity of the Sensory-motor System

3.5 Modularity of the Conceptual-intentional Interface

3.6 Modularity of the Computational Hub

4 Predictive Processing

4.1 Predictive Processing in the Sensory-motor System

4.2 Predictive Processing in the Conceptual-intentional Interface

4.3 Predictive Processing in the Computational Hub

5 Conclusion

5.1 Specificity in the Brain

5.2 Information Processing in the Brain

5.3 Differences of the Paradigms

5.4 Critique of Modularity

5.5 Critique of Predictive Processing

5.6 Integration and Outlook into the Future


1 The Evolution of Language

It is reasonably apparent that human language use is unique on our planet. It is less clear, however, why language exists at all. To investigate language and its apparent uniqueness to the human species, it is necessary to capture the actual functions of language correctly. The following section provides a multidisciplinary overview of the evolutionary history of language and the computations underlying it.

1.1 What is Language Good for?

Many different conceptualizations of language’s function prevail in the modern literature. Számadó and Szathmáry (2006) mention 16 different functions for language, all but one of which propagate the view that language is a tool for communication. This claim, however, may form a crucial pitfall when examining the origin of language. Another theory holds that language functions as a tool for exercising mental thought (Vygotski, 2012). The development of language requires surpassing the mere purpose of communication because, if communication were its sole purpose, there would be no need for it to develop at all. Communication is ubiquitous in the animal kingdom, whether in systems based on memories of certain sounds (e.g. in non-human primates) or in more elaborate vocal learning systems in which sound production is actively manipulated (e.g. in humans or songbirds) (Snowdon, 2009). The prerequisites enabling a species to acquire elaborate tools for communication are not even restricted to mammals, as the example of songbirds shows. While proficiency in vocal learning is seen as a continuum, this continuum does not follow the gradual course of evolution across species (Petkov & Jarvis, 2012).

Thus communication seems not to be a trait favoured by natural selection. Should it form the only force driving language development, then, paradoxically, the need for language disappears, since more traditional forms of animal communication would suffice. The most reasonable interpretation, however, is that language ultimately serves both ends. It is undeniable that language as a tool for communication brings evolutionary advantages over comparably simpler forms of animal communication. The level of abstraction applied in human language use enables communication about entities that are distal in space and time. Thereby, complex foraging patterns can be forged. Abstract communication can take place about the volatility and location of food sources in the environment which prove beneficial for a group but are too big or dangerous to be approached individually (Hockett & Hockett, 1960).

Language helps to afford the uniquely human ability to think and to entertain representations of directly observable objects as well as abstract concepts (Lupyan & Clark, 2015). Radical enactivism, on the contrary, argues against the claim that language is necessary for structuring thoughts. According to Munz (2002), embodied theories, where a theory constitutes an organism in its environment, are not expressed in language; rather, the crucial factors making an organism and its environment part of a subjective reality are the organism’s anatomical features and its interaction with the environment. This may be taken to mean that an organism itself contains information about its environment. In the case of a human, for example, the person’s location together with the length of their arms defines features of the environment such as “graspable objects” (Munz, 2002). Thereby the need for representational accounts of language vanishes, since language is a situational phenomenon manifesting itself in reality as an interplay of organism and environment.

A radically different, more pragmatic view on language use is proposed by H. H. Clark (1996). He argues that human language use is a form of joint action. Language serves the sole purpose of getting things done, as a process of two or more entities acting together in a well-coordinated way. For example, a teacher presents material to his students and they listen, which is a joint action between a single individual and a group; in a business negotiation, language is the joint action between two individual parties.

The remainder of this work adopts a definition of language as both a tool for exercising mental thought and a tool for communication.

1.2 Why did Language Evolve?

Language used for communication is subordinate to the language of thought, which enables us to represent the world and ultimately to create our own realities from those thoughts. Regardless of the stance taken towards enactivist or representational accounts, both entities manifest themselves in the subjective reality of a subject. This representative and symbol-forming ability ultimately yields the faculties of internal thought and planning (Jacob, 1982). Adopting this point of view renders the sophisticated communication device applied by humans an application of the originally acquired, unique tool for thought (Berwick & Chomsky, 2015).

For some reason this tool for thought manifested itself in the human evolutionary heritage. A straightforward answer to why that happened unfortunately lies beyond the possibilities of empirical science. Evidence about precursory aspects of language is missing, rendering the interpretation and identification of those aspects speculative (Lewontin, 1998).

1.3 How did Language Evolve?

Even though Lewontin (1998) raised concerns about the epistemological possibilities of language research, others took considerable effort to explain how language may have evolved. The prerequisites for language acquisition are present amongst all humans across the globe, regardless of race, gender, nationality or mother tongue. The critical period for language acquisition does not differ across humans either (Hauser, Chomsky, & Fitch, 2002). This section elucidates the evolutionary course of human language and its utility as described in the previous sections.

Language did not evolve in complete isolation from other capacities; rather, the co-evolution of several capabilities fostered the uniqueness of humans in many different ways. Many more far-ranging concepts such as creativity, imagination and mentalizing were affected by the evolutionary course of language. Those capacities in turn fostered the development of more recent constructs such as religions, sciences, social conduct and practices, which ultimately helped to build coherent social groups and societies. All of the above contributed considerably to the success of humans as a species (Berwick & Chomsky, 2015). Taken together, those constructs typically fall under the umbrella term “human capacity”, distinguishing humans from every other life form on earth.


A study by Hermer-Vazquez, Spelke, and Katsnelson (1999) proposed set formation as a basic necessity for the emergence of language. Binding two or more feature elements together in a meaningful way is a first step towards structuring and communicating (abstract) thoughts. They tested whether adults, pre-verbal infants and non-verbal-learning primates are able to combine features in a non-verbal visual search task. The result was that monkeys and infants perform similarly, while older humans perform considerably better. The emergence of the capability of set formation thus provides a viable candidate for natural selection, since it can offer advantages in planning, reasoning and navigation. This study nicely demonstrates the shift in reasoning away from communication as the primary aspect of investigation and towards more abstract aspects as the basis for language acquisition.

The capacity to imitate is another informative candidate mechanism related to the evolutionary history of language acquisition, as it affords the ease of copying language use from other members of the species. Imitation is shared amongst vocal learners such as dolphins, songbirds and humans but is absent in the closer evolutionary relatives of those species. Since vocal learning and voluntary sound production are a necessity for language use, other underlying traits common amongst vocal learners are feasible candidates for language precursors. The mirror neuron system alone seems necessary but not sufficient for imitation, as can be seen in chimpanzees. Consequently, the neural substrates of those supposed precursors have to exceed the mirror neuron system (Hauser et al., 2002). Neurocognitive correlates of imitation include activation in the premotor cortex, the motor cortex and Broca’s area in the case of goal-directed behaviour (Koski et al., 2002; Nishitani & Hari, 2000). All those areas serve far-ranging tasks in initiating and executing movement. Activation was also found in the left STS, which is a hotspot for social information (Iacoboni et al., 2001). Interestingly, in the case of lip-form observation, which is directly connected to verbal imitation, activation during formation, production and observation differed only in the involvement of the occipital lobe for the latter (Nishitani & Hari, 2000). All activations during imitation suggest a recruitment of more domain-general mechanisms for the development of the capacity to imitate. If the argument holds true that imitation is needed for language learning, then evolutionary changes in those regions and in their connectivity fingerprints to other areas are viable candidates for forming a neurocognitive precursor of language development, considering that the capacity for imitation is most pronounced amongst humans. A. Lieberman, Cooper, Harris, and MacNeilage (1962) argue with the classic motor theory of speech perception that speech perception is the identification of gestures of the vocal tract rather than the analysis of the sound patterns they generate. This view falls in line with the importance of gestural imitation in language learning. The theory has been constantly revised and empirically tested, and there is mounting evidence that it is not completely, but partly, true. The most recent revision of the theory by Galantucci, Fowler, and Turvey (2006) still supports the claims that perceiving speech is perceiving gestures and that speech perception involves motor system activity. However, the notion that speech is special has been abandoned, since the perception of random non-speech sounds elicits activation patterns similar to those of speech perception (Fowler & Rosenblum, 1990).

The remarkable homogeneity in the acquisition of those traits across the globe dates the evolutionary development of language acquisition fairly accurately to somewhere before the African exodus of the human race, roughly 60,000 years ago (Lenneberg, 1967). While this establishes the upper boundary, the lower boundary has to be dated to the appearance of the first modern humans in Africa, approximately 200,000 years ago (Pagani et al., 2015), leaving a comparably narrow time window of about 140,000 years. It is reasonable to assume the existence of genetic and morphological proxies of language, but evidence about our evolutionary past is scarce and definite conclusions are hard to draw. A well-known fact, though, is that the emergence of new technologies and tools (in this case for thought) precedes the emergence of new species and evolutionary change considerably. This lag in the emergence of new traits and tools makes it nearly impossible to date the genetic changes giving rise to the faculty of language (Tattersall, 2008).


2 The Faculty of Language

Now that the evolutionary origins of language have been outlined and several accounts of its function explained, the following section deals with the composition of language. I set out to explain the elementary building blocks used for understanding and producing language. The section closes with the computational properties that make language special and crucially distinguish it from animal communication. After an introduction of those building blocks with respect to language production, perception and syntax, the following sections review them in the light of the predictive processing framework and the massive modularity account rooted in evolutionary psychology.

Hauser et al. (2002) use two different definitions of the faculty of language and distinguish between a broad and a narrow sense (see Figure 1). The broad definition renders language as a holistic concept with different functions, such as planning, structuring thought, communicating and thinking, whereas the narrow definition includes only a computational device giving rise to the globally homogeneous capacity for language acquisition, which is the only uniquely human part of language. Thereby the narrow notion disregards superficial differences in language production and perception, focussing merely on the computational principles enabling humans to use language. Since the topic of this review is language, not the unique capability per se, I will use the broad definition of the faculty of language throughout the rest of the article. After all, language is ubiquitous in the human environment with all its functions and mechanisms and is not restricted to one special characteristic.


Figure 1. Composition of the broad and narrow Faculty of Language (FoL) after Hauser et al. (2002). The broad FoL consists of three building blocks: the Sensory-motor System for production and perception, the Conceptual-intentional Interface, which gives meaning to concepts, and the Computational Hub, which establishes syntax. Only the latter is included in the narrow definition, being the only uniquely human part.

2.1 The Sensory-motor System

In order to perceive and produce language we need some kind of connection to the outer world. This connection manifests itself in the form of the sensory-motor system. To understand spoken language we have to hear it, to understand sign language we have to see it, and Braille is felt. The sensory-motor system with regard to language is thus not limited to a single modality but incorporates input from a variety of senses. The same is true for producing any kind of communicative output, whether through the larynx or through hands forming gestures, or more generally through body movements. The system, however, does not interpret complex human language. After all, it only transforms concepts into a format that others can understand, such as English. In this sense, the sensory-motor system mainly serves as a communication tool, and although necessary it is not sufficient for language use. Recording input and producing output render it a device that is not concerned with the underlying computations giving structure and meaning to the produced or perceived elements (Hauser et al., 2002). In humans the sensory-motor system is grounded in caudal regions of the frontal cortex, affording motor-related activity by projecting into and receiving connections from auditory and visual cortices in a modality-independent manner (Rizzolatti & Luppino, 2001; see Figure 2 a) for a representation of sensory-motor system activity and its location).

Striking functional and structural similarities can be found between the brains of songbirds and humans in their externalization systems for speech and sound, despite a 600 million year long gap in evolutionary history. The sensory-motor system of vocal learners is equipped with strong projections from the motor areas towards vocal neurons in the brain stem, directly connecting it with the nerve of the larynx. This enables vocal learners to modulate speech sounds. Those projections are absent in non-vocal learners, rendering their sound production less controllable and more automatic (Pfenning et al., 2014).

2.2 The Conceptual-intentional Interface

The conceptual-intentional interface constitutes the second building block of the broad faculty of language. It is the interface where meaning is given to the constructs learned from the environment (Hauser et al., 2002). Speculations about the exact nature of word-world binding remain vague and could fill a whole other article or even a book. Since this is not the focus of the current work, the hypotheses will be kept relatively short here.

The notion of a mental lexicon, where words and their corresponding meanings are stored on a kind of hard drive, is implausible from a modern neurocognitive perspective. A prevailing idea is that language actively shapes mental states and perceptions similarly to non-language input (Elman, 2009). Further evidence in favour of this view comes from an fMRI study that compared activation between a neutral condition and a condition in which participants had to attend to vehicle or human motion while watching videos, demonstrating neural representations of the attended action concepts (Çukur, Nishimoto, Huth, & Gallant, 2013). Those findings demonstrate the embodiment of concepts, where language activates similar neural patterns as other types of input in areas corresponding to the imagery, production or perception of concepts (Lupyan & Bergen, 2016). Those ideas favour the notion of parallel processing in the brain as propagated by e.g. Mars et al. (2009a) and challenge the idea of functionally isolated parts acting upon a special kind of input.

Figure 2. a) Activity in the sensory-motor system and its location (white arrow), adopted from Pascual-Leone, Amedi, Fregni, and Merabet (2005). b) Activity during concept retrieval (pink) and syntax processing (yellow/pink); a functional dissection of those two roles in language processing has not yet been achieved. Adopted from Grodzinsky and Friederici (2006). c) A holistic inventory of the language architecture with pathways and respective areas. An interplay between those areas along a dorsal and a ventral pathway ultimately affords the emergence of language with all its properties. Adopted from Berwick, Friederici, Chomsky, and Bolhuis (2013).

On the grounds of the theoretical considerations just mentioned, the conceptual-intentional interface constitutes more of a thought experiment than any real, observable hub of hard-wired information in the human brain. It nevertheless forms a crucial part of language use, so its theoretical consideration is justified on the basis of necessity (Hauser et al., 2002). The implementation of an interface affording communication between brain patterns and the names of the respective concepts is best thought of as a kind of hub in the brain which serves concept retrieval. A viable candidate is BA44 in the temporal lobes (see Figure 2 b)).

2.3 The Computational Hub

The last element of the broad faculty of language is the computational hub, giving rise to the central structural properties of human language. Hauser et al. (2002) regard it as the single uniquely human part of language. The computations it performs distinguish human language from other forms of animal communication and establish its uniqueness. Therefore syntax-related activity is paramount for its implementation in the brain. Such activity is found in the frontal operculum, the anterior STG, Broca’s area and BA44/45 (Grodzinsky & Friederici, 2006) (see Figure 2 b)). It is already visible that some of that activity overlaps with activity afforded by the conceptual-intentional interface. Therefore, until the respective roles of the nodes inside this network are reasonably well dissected, activity giving rise to language in all its parts can only be attributed to the network shown in Figure 2 c) as a whole. This last part, which implements a special structure in human language, is therefore discussed in more detail. In the following, its unique aspects are presented, followed by the proposal of a basic mechanism called “Merge”, rooted in the so-called strong minimalist thesis (Chomsky, 1993).

2.3.1 Recursion. The most prominent feature of human language, absent in all other forms of communication, is recursion. Language is a computationally finite system (the resources as well as the computational mechanisms are limited) yielding a discretely infinite array of expressions. Discrete infinity here refers to the fact that language always relies on whole words as building blocks (e.g. there can be a 5-word or a 6-word sentence but no 5.5-word sentence). To make this idea more concrete, what is essentially at the heart of human language is the unique ability to keep extending given expressions (e.g. with the help of relative clauses) (Fitch, Hauser, & Chomsky, 2005).
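Discrete infinity can be illustrated with a minimal Python sketch. The function names `noun_phrase` and `sentence` and the toy vocabulary are hypothetical and not taken from the cited literature; the sketch only shows how one finite rule, applied to its own output, produces ever longer expressions that always consist of whole words.

```python
# Toy sketch (an assumption for illustration, not a model from the cited work):
# one finite rule that can be applied to its own output.

def noun_phrase(depth: int) -> str:
    """Build a noun phrase with `depth` embedded relative clauses."""
    if depth == 0:
        return "the biscuit"
    # Recursive step: a relative clause that contains another noun phrase.
    return "the girl who saw " + noun_phrase(depth - 1)

def sentence(depth: int) -> str:
    """Wrap the noun phrase in a fixed sentence frame."""
    return "Daisy knows " + noun_phrase(depth)

# Every extension adds whole words only, so sentence length stays an integer.
word_counts = [len(sentence(d).split()) for d in range(4)]

for d in range(4):
    print(word_counts[d], sentence(d))
```

Nothing in the rule itself limits `depth`; in this toy setting only practical resources such as memory or time bound the length of the expressions.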

2.3.2 Structural vs. Linear Distance. Another unique characteristic of syntax is that structural distance is more important than linear distance. Adults interpret two related words almost perfectly no matter how far apart they are in the sentence. Structural distance matters more in resolving ambiguities (Crain, 2012). For example, the sentence “She said, Daisy ate a biscuit” is clearly unambiguous, since the eating has to refer to Daisy and not to “She”: when a pronoun comes before another pronoun or a potential antecedent such as “Daisy”, the two words are not allowed to be linked. However, “Daisy said, she ate a biscuit” is more ambiguous: “ate” clearly refers to “she”, but “she”, which now follows “Daisy”, could be either Daisy or someone else. A further example of truly ambiguous structure is the sentence “While she ate a biscuit, Daisy prepared cake”. Here “she” could refer to “Daisy” even though it precedes the name. So even with reversed linear order it is still possible for the two words to refer to the same person (see Figure 3).
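The contrast between linear and structural relations can be made concrete with a small Python sketch. It rests on assumptions that go beyond the text above: the bracketings are simplified, hypothetical parses, and the structural relation used is c-command, the relation standardly invoked for such cases in generative syntax (a pronoun may not corefer with a name that it c-commands). The helper names `find_path`, `c_commands` and `may_corefer` are illustrative only.

```python
# Sentences as nested lists; leaves are words. The bracketings are
# simplified, hypothetical parses used only for illustration.
S1 = ["she", ["said", ["Daisy", ["ate", ["a", "biscuit"]]]]]
S2 = ["Daisy", ["said", ["she", ["ate", ["a", "biscuit"]]]]]
S3 = [["while", ["she", ["ate", ["a", "biscuit"]]]],
      ["Daisy", ["prepared", "cake"]]]

def find_path(tree, word, path=()):
    """Return the index path from the root to the first leaf equal to `word`."""
    if isinstance(tree, str):
        return path if tree == word else None
    for i, child in enumerate(tree):
        found = find_path(child, word, path + (i,))
        if found is not None:
            return found
    return None

def leaves(tree):
    """Flatten the tree into its linear word order."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree for w in leaves(child)]

def c_commands(tree, a, b):
    """Assumed relation: `a` c-commands `b` if the node directly above `a`
    also dominates `b` (and `b` is not `a` itself)."""
    pa, pb = find_path(tree, a), find_path(tree, b)
    parent = pa[:-1]
    return pb[:len(parent)] == parent and pb[:len(pa)] != pa

def may_corefer(tree, pronoun, name):
    """Coreference is blocked only when the pronoun c-commands the name."""
    return not c_commands(tree, pronoun, name)

for s in (S1, S2, S3):
    words = leaves(s)
    precedes = words.index("she") < words.index("Daisy")
    print(" ".join(words), "| she precedes Daisy:", precedes,
          "| may corefer:", may_corefer(s, "she", "Daisy"))
```

In this sketch the pronoun precedes the name in both the first and the third sentence, yet only the first blocks coreference, so linear order alone cannot account for the difference.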

Those characteristics distinguish human language from the communication systems employed by non-human vocal learners. Songbirds, for example, rely heavily on linear distance in sound production. Furthermore, they cannot produce an infinite variety of songs; their repertoire is constrained and lacks a human sentence structure (Berwick, Okanoya, Beckers, & Bolhuis, 2011). Before illuminating the last unique property, displacement, the concept of Merge has to be explained.

Figure 3. Both sentences are open to ambiguous interpretation: in both instances "she" may refer to Daisy or to someone else. Both interpretations are possible in a) and b) even though the linear order is reversed. The only way to resolve this ambiguity is to rely on structural distance instead of linear distance, which makes it obvious what "she" and "Daisy" respectively refer to. Should "she" and "Daisy" lie at one structural level, resolving the ambiguity would not be possible.

2.3.3 Merge. The strong minimalist thesis (SMT) radically strips away superficial unique properties of language and proposes a basic computational principle as the sole aspect rendering language special. This sole property giving rise to the human faculty of language is a computational mechanism called “Merge”. “Merge” is a computationally perfect process that forms unordered sets of the most basic word-like elements of language by combining two of those elements into one bigger set, subsequently building more and more complex structures from those basic element sets (Chomsky, 2014). Notably, Merge does not imply any form of structural sequence but is merely a tool for set formation. For example, the elements “drive” and “car” can be combined into {“drive”, “car”} but equally well into {“car”, “drive”}. As long as no need for communication, planning or structuring arises, the actual order of the atomic elements of thought does not matter.
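The set-forming character of Merge can be sketched in a few lines of Python. Representing a merged object as a `frozenset` is an assumption made here for illustration; the function name `merge` and the extra element “will” are hypothetical and not part of the cited proposal.

```python
# Minimal sketch, assuming a merged object can be modelled as an unordered,
# hashable set of exactly the two objects that were combined.

def merge(x, y):
    """Combine two syntactic objects into one unordered set."""
    return frozenset({x, y})

drive_car = merge("drive", "car")
car_drive = merge("car", "drive")

# No order is imposed: both calls yield the same object.
print(drive_car == car_drive)

# The output of merge can itself be merged again, which builds hierarchy.
bigger = merge("will", drive_car)
print(drive_car in bigger)
```

Because a `frozenset` is hashable, the result of one `merge` call can serve as input to the next, which mirrors the step from basic element sets to more complex structures described above.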

2.3.4 Displacement. Why and how the importance of word-order diminishes can be seen by the application of displacement. Displacement refers to the fact that in human language use, some positions of words are left open to interpretation, while other positions are made final through pronunciation. Consider the following examples to elucidate the phenomenon:

“Daisy eats a biscuit” as opposed to “What is Daisy eating?” In both cases humans instinctively link “eat” to “biscuit” or, respectively, to “What”, even though the positions in the sentences differ. Following Merge, there are two possibilities for a working interpretation: either X (the expression referring to “What”) and Y (the expression referring to the eaten object, “apple” in Figure 4) are separate from each other (X | Y), which is called “external merge”, or X is a subset of Y (Y contains X), which is called “internal merge”. In the example for internal merge, the operation could recursively add a part of the set, creating “What is Daisy eating what”, since “what” and “apple” contain each other. Since natural language use circumvents this arbitrariness, the second “what” is displaced to the position of the first one. In this sense natural language adheres to computational efficiency (see Figure 4). To demonstrate the use of external merge (EM), the example has to be extended to “Look what Daisy is eating”. By adding the word “look”, a new set (Z) can be formed that is not part of the old set {X, Y} consisting of “what” (X) and “eating” (Y). Since the two sets are completely separate from each other, external merge can combine them into [Look] [what Daisy is eating what], which is ultimately uttered as the phrase mentioned above, leaving the last “what” for interpretation since it is already pronounced once as part of the same set formed by internal merge. Displacement here refers to the ability of human language to automatically strip sentences of redundant pronunciations and to place the pronounced and the interpreted position according to context (Chomsky & Collins, 2001).
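The pronunciation side of displacement can be sketched as follows. The sketch assumes, as a simplification, that the structure has already been flattened into a word list in which a displaced element occurs twice; the function name `pronounce` and the parameter `moved` are hypothetical.

```python
# Minimal sketch: only the first (highest) copy of a displaced element is
# pronounced; later copies remain available for interpretation only.
# The flat word list is a simplifying assumption; Merge itself yields sets.

def pronounce(tokens, moved):
    """Return the uttered string, dropping repeated copies of moved items."""
    seen = set()
    spoken = []
    for tok in tokens:
        if tok in moved:
            if tok in seen:
                continue  # lower copy: interpreted but not pronounced
            seen.add(tok)
        spoken.append(tok)
    return " ".join(spoken)

question = ["what", "is", "Daisy", "eating", "what"]
embedded = ["look", "what", "Daisy", "is", "eating", "what"]

print(pronounce(question, moved={"what"}))
print(pronounce(embedded, moved={"what"}))
```

Pronouncing both copies would correspond to the redundant variant that natural language avoids.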


Figure 4. The displacement property of language. "Merge" creates a set X containing "apple" and "eat", but also "what", since "what" contains "apple"; either "apple" or "what" is therefore pronounced while the other has to be interpreted. a) An example where "apple" is pronounced. b) An example where "What" is pronounced and "apple" is interpreted. c) An example where both positions would be pronounced instead of leaving one open for interpretation (note: c) is not used in natural human language).


3 Modularity of Mind

Now that I have sketched an evolutionary history of the faculty of language along with its constituent parts, an examination of those parts follows in order to disentangle the ontology of human language use. The current section is thus reserved for reviewing the building blocks of language in the context of the massive modularity assumption rooted in evolutionary psychology. The subsequent section treats the more recent account of predictive processing with a similar structure. To clarify the definition of modularity of mind, the first part of this section serves as an introduction to the paradigm, followed by an application of its theoretical implications to the building blocks of language.

3.1 Evolutionary Psychology

Evolutionary psychology aims at discovering the processes which shaped the human mind throughout evolution, thereby disentangling its architecture. Its behavioural implications are built on findings from evolutionary biology (Cosmides, Tooby, & Barkow, 1992). The mind is viewed as an information-processing system receiving certain inputs from sensory organs and transforming them into symbolic representations (outputs) which can be acted upon behaviourally. Adherents assume that the human mind consists of mechanisms shaped and propagated across generations through natural selection. The notion of natural selection implies a gradual advance guaranteeing the success of our evolutionary predecessors in terms of development, survival and reproduction. These so-called behavioural modules are computational mechanisms that evolved over time and were shaped by natural selection (Frankenhuis & Ploeger, 2007). Thus, Evolutionary Psychology (EP) aims at discovering and understanding the architecture of the human mind and brain from an evolutionary point of view (Cosmides et al., 1992). Examples of such processes are social exchange and altruism (Cosmides & Tooby, 1992, 2005), kin detection (D. Lieberman, Tooby, & Cosmides, 2007), face recognition (Duchaine, Yovel, Butterworth, & Nakayama, 2006; McKone, Kanwisher, & Duchaine, 2007), language (Pinker, 1994; Pinker & Bloom, 1990) and numerosity (Cohen & Dehaene, 1996).


3.2 History of the Modularity of Mind

Evolutionary Psychology is thus concerned with the adaptive value of cognitive processes. A prominent stream inside EP regards these processes as separate modules constituting the mind. The modularity hypothesis states that the mind consists of a collection of functionally specialized mechanisms, called evolved modules. Modules are defined as specialized neurocognitive mechanisms that developed in order to solve the adaptive problems our evolutionary ancestors faced (Frankenhuis & Ploeger, 2007). Conceptions and definitions of modules, however, have changed throughout the history of the modularity paradigm.

Therefore a history of the development of the modularity of mind is sketched, from Jerry Fodor’s influential book “The modularity of mind” (1983) up to modern-day conceptions of the massive modularity hypothesis.

The early ideas restricted the scope of the theory to peripheral (lower-level) cognitive processes only. The possibility of investigating higher-level processes, which require consciousness and the ability to hold concepts in mind (such as working memory, attention, language or mental imagery), is completely negated (Frith & Dolan, 1996). In order to understand the theory and its limitations it is necessary to sketch the conceptual build-up of the mind from a Fodorian point of view: the mind consists of peripheral and central processes, and only the peripheral ones adhere to the definitions of modularity. Firstly, the world around us (e.g. in the form of light or sound waves, or haptic sensations) is passively recorded by transducers, which make up any kind of sensory organ (e.g. the eye, the auditory tract or the skin). The peripheral (or input) systems then receive input from those transducers and transform this raw input into a format appropriate for further computation. Those further computations happen in central systems, defined as the realm of thought and abstractions, where e.g. language or belief fixation happen (see Figure 5).

Fodor (1983) proposed 9 features peripheral processes have to possess in order to be deemed modular. However, the list is not exclusive: modularity is a matter of degree rather than an absolute categorization. Consequently, the absence of some of those features does not exclude a process from considerations of modularity. The 9 features in order of appearance are:

• 1st Domain specificity

• 2nd Mandatoriness

• 3rd Limited central accessibility

• 4th High processing speed

• 5th Information encapsulation

• 6th Shallowness of output

• 7th Fixed neural and computational architecture

• 8th Characteristic output patterns

• 9th Innateness

Although not all modules necessarily need to possess all of those characteristics, information encapsulation (5) is at the heart of the theory and is deemed necessary for every process to be considered a module. However, knowledge in the field of cognitive neuroscience advanced considerably, which in turn fostered theoretical advances as well. The notion of modular peripheral systems gradually advanced towards considering more central systems as modular as well, ultimately resulting in the massive modularity hypothesis. A plea for a holistically modular view of the human mind emerged. I will explain those characteristics further where needed in order to gain a good understanding of the modern definition of a module; for a thorough explanation of all characteristics see Fodor (1983).

3.3 The Massive Modularity Hypothesis

The most radical change since the introduction of modularity is the shift towards a completely modular mind, away from solely peripheral views of modularity. Fodor (1983) acknowledged the impossibility of researching central systems, on the grounds that they cannot be understood at all, in his “First Law of the Nonexistence of Cognitive Sciences”. This law states that the more global/isotropic (i.e. higher-level) a process is, the less anyone can understand it. Nowadays, considerable advances are being made in understanding those central processes, and the infamous “black box” of the human mind is gradually becoming more transparent.

Figure 5. The architecture of the mind and the interplay of its components from a massively modular point of view. Transducers record the world around us, the Sensory-motor System transforms this raw input into an interpretable format for higher-order cognitive processes, which in turn further compute the input and send it back to the Sensory-motor System to produce meaningful output.

Carruthers (2006) marked a radical turning point in the modularity debate by denying the necessity of information encapsulation, formerly the most crucial point for modularity. More specifically, he split up the rigid notion into narrow-scope and wide-scope encapsulation. Only 4 of the 9 original Fodorian characteristics remain important, namely:

1st Information encapsulation, originally referring to the fact that modules operate only on a very restricted set of input formats, limiting the resources a module can draw from. Conversely, encapsulation can be viewed as limited central accessibility, precluding computational processes from access to conscious thought and introspection (Fodor, 1983). More recent interpretations distinguish two views of encapsulation, namely narrow-scope and wide-scope. Narrow-scope encapsulation refers to a system incapable of utilizing information located outside of it. The wide-scope notion, however, allows a system to draw on external information during the computational process it carries out.

2nd Dissociability, or put differently, characteristic breakdown patterns (Fodor, 1983). The term should not be taken in the common local sense, because of the highly parallel nature of information processing in the brain (Mars et al., 2009b). While a module has to be implemented spatially in some way, neural correlates subserving a task are rarely found at one specific location. The term functional dissociability is thus more appropriate, referring to a characteristic and relatively well isolated breakdown in performance.

3rd Domain-specificity marks a characteristic we find within functional dissociability. A module only operates within a specific domain, and a domain resides in a certain modality. Face recognition, for example, resides in the modality of vision (Fodor, 1983). Thus the range of input formats for which a module carries out computations at all is strictly restricted (Carruthers, 2006; Samuels, 2006).

4th Mandatoriness or automaticity renders a system modular if the computations take place without exercising conscious thought. Once the process is initiated it cannot be stopped midway and is carried out reasonably fast and precisely (Bargh & Chartrand, 1999). It can be regarded as an automatic processing pipeline activated by encountering the system’s specific input format. Along the pipeline there is no way to stop the input from being processed until the output is reached. The process cannot exit along the way. A car entering a tunnel may serve as a metaphor: the car always has to drive to the end of the tunnel before any deviation from its original path is possible. A compelling piece of evidence is the perception of a three-dimensional scene: theoretically, an image reaches the retina as an assembly of two-dimensional shapes and colours. Nonetheless it is impossible to perceive our surroundings solely as hues and shapes; instead, different aspects of the picture are automatically integrated into a coherent sensation (Zeki & Bartels, 1999).

Throughout the history of modularity, researchers applied several definitions of modules and their corresponding features. Since the topic of the current review is language, surely a higher-level cognitive process, the corresponding building blocks (see section 1) will be investigated in the light of the most recent definition of modularity, which includes high-level central processes. Despite not being clearly named modules, the 3 building blocks of the broad faculty of language are regarded as necessary and sufficient mechanisms giving rise to human language use (Hauser et al., 2002). Based on this definition I discard the notion of a single language module. Instead, language emerges through the interplay of the three building blocks.

The recent notion of those three building blocks may change our conception of the constituent parts of language. However, the case for innateness rests on two main arguments, both of which refer to language in general. Firstly, a critical period for language acquisition exists, as can be seen in studies of feral children. For example, the infamous case of “Genie”, who grew up in total isolation, shows a failure to fully acquire proficiency in a first language, while rudimentary language processing abilities and forms of syntax did develop (Curtiss, 1977). The notion of a critical period for language development, ending before cerebral lateralization is completed (Lenneberg, 1967), makes a strong claim in favour of the innateness of the capacity to learn language. Secondly, there is the well-known poverty of the stimulus argument. Chomsky et al. (2006) claim that the natural environments children are exposed to are neither informative nor rich enough to successfully extract all features of a first language. Furthermore, they note that typical child-directed speech is too poor in syntactic structure for syntax to be acquired from it alone. Therefore this rudimentary input to the module serves as a trigger for the innate capability of language use to emerge.

all its constitutional parts, but since modern conceptions of the modularity discard the notion of one single module sub-serving all tasks of language I will investigate the three constitutional parts mentioned by Hauser et al. (2002) throughout the rest of the article.

3.4 Modularity of the Sensory-motor System

To what extent can the sensory-motor system be considered a module? Before investigating this issue, I want to clarify that the sensory-motor system as a whole is responsible for perceiving, sensing and acting upon the world around us. Functional specificity in a modular sense is therefore not given as such. However, since the topic of the current review is language, the mechanisms of language perception and production ought to suffice as the object of investigation. Therefore I handle only those aspects in this section. Language perception fulfils information encapsulation to some degree. The set of interpretable input formats should be limited to language and should not be confused with other types of perceptual input, such as random sounds, music or pictures. However, the sensory-motor system does act on that other input as well and is also deemed active during action understanding and decision processing; information encapsulation with regard to language is therefore hard to defend (Rizzolatti & Luppino, 2001). It is at best wide-scope encapsulated, because during language perception the system has to draw on sources of information outside the realm of language perception in order to interpret language in a meaningful way. For instance, memory retrieval processes have to be recruited in order to recognize perceived words. The same holds true for producing language. In order to utter certain words we have to draw on experience and memories, for example when recalling a story or trying to retrieve the meaning of a rarely used word.

Dissociability of language perception and production has been demonstrated. Wernicke’s (receptive) aphasia is a characteristic breakdown pattern. Patients with this disorder suffer from a malfunction of Wernicke’s area, located in the posterior temporal lobe (Binder et al., 1997). Different forms of Wernicke’s aphasia exist, but all are marked by an absence of perceptual or motor deficits, while language production and perception are affected. Speech is fluent but lacks meaning, and patients have no difficulty retrieving words. Furthermore, syntactic structure remains intact and patients are typically unaware of their poor language abilities (Ellis, Miller, & Sin, 1983). Wernicke’s aphasia impairs perception as well as production. Even though the breakdown patterns are characteristic of language deficits, this finding argues against viewing the sensory-motor aspect of language as two different modules (one for production, one for perception).

The third criterion for modularity, functional specificity, is seriously challenged by the symptoms of Wernicke’s aphasia, which concern both production and perception. A modular process is domain-specific, thus its operations act upon limited input formats. In the case of language the input formats may be relatively restricted. However, it was noted that a domain resides within a certain modality. Language seriously challenges the notion of domain or modality restrictions. Perception and production are two non-dissociable functions. Additionally, they reside in different modalities, as can be seen in the case of spoken, written, seen (in the case of sign language) or felt (Braille) language. Besides that, language may be produced either vocally or with gestures. The fact that input from all those modalities is processed similarly in the brain, namely in the language centres Wernicke’s and Broca’s area, residing in the temporal and frontal lobes respectively, also violates the notion of restricted input formats of a system (Constable et al., 2004). Furthermore, functional specificity is challenged in the case of Broca’s area (BA44), where neural correlates for language, tool use and motor sequence planning are shared (Clerget, Badets, Duque, & Olivier, 2011; Higuchi, Chaminade, Imamizu, & Kawato, 2009).

Considering automaticity, planning, speaking and movement formation happen automatically and we cannot describe how those outputs are formed. Nor can we consciously decide not to understand a word or a sentence in a language in which we are sufficiently proficient. The fact that Wernicke’s aphasia patients are typically unaware of their problems shows that the processes are excluded from conscious inspection (Marshall, Robson, Pring, & Chiat, 1998). Thus the notions of mandatoriness and automaticity are confirmed.

In sum, it can be concluded that the sensory-motor system as a whole does not adhere to modern conceptions of modularity, due to its multi-modal in- and output and its involvement in tasks other than language, such as motor planning and internal representation of movement (Rizzolatti & Luppino, 2001), the imitation of movements (Iacoboni, 2005), and the production of voluntary movements through supplementary motor areas (Hallett, 2007).

3.5 Modularity of the Conceptual-intentional Interface

The second building block of the broad faculty of language serves the task of giving meaning to objects and constructs. Its task is thus to interpret words and retrieve meaning. The earlier mentioned analogy of a hard drive storing words and their meanings is implausible. As mentioned earlier, not much is known about the exact process by which the brain solves this task, nor about the evolutionary origin of word-object or, for abstract constructs, word-world pairings. However, the undeniable fact that humans form such pairings makes it feasible to investigate this process in the light of modularity.

Modern proponents of embodiment research acknowledge the fact that action words elicit activation in the appropriate motor areas (Pulvermüller, 2013). A word may elicit a certain pattern of activation which ultimately leads to recognition or production of this word (Jeannerod, 2006). Toni, De Lange, Noordzij, and Hagoort (2008) explored activation beyond action words in language processing, proposing a view of more extensive activations beyond the sensory-motor system. In their view, activation of categories and concepts relies on the respective characteristics of words. While tools, for example, emphasize functional importance (Warrington & Shallice, 1984), other categories such as living things emphasize visual characteristics (Warrington & McCarthy, 1987). Even inside those categories the weight assigned to dimensions of functional or visual importance varies as a result of object meaning or context (Toni et al., 2008).

Words are thus activation patterns of congruent sensory, motor and perceptual information, which are active when those words are transformed into action or when perception is given meaning. Therefore, even though the system can draw information from a wide array of different brain regions, its input formats are restricted to those word patterns. In a wide sense it is thus properly encapsulated from other systems. The first notion of modularity is thus reasonably fulfilled.

To fulfil the dissociability requirement, the breakdown pattern would ideally consist of a behavioural difficulty in retrieving word meanings. Exactly this pattern of deficits is found in anomic aphasia, where patients display difficulties in retrieving word meanings while other cognitive abilities remain intact (Dronkers, 2010). The disorder is marked by damage to the left arcuate fasciculus, an interconnected network of brain regions extending from the left temporal to the left parietal lobe (Woollams, Cooper-Pye, Hodges, & Patterson, 2008). Another viable candidate for a characteristic breakdown pattern is semantic dementia, marked by an inability to retrieve meaning and concepts due to prefrontal cortex and anterior temporal lobe dysfunction, paralleling the sensory-motor system (Martin & Chao, 2001).

Its function is also clearly domain-specific. It is a device giving semantic meaning to objects, actions or constructs in a language in which one is proficient. However, it does not reside in a single modality, and the whole of language perception and production processes may draw on its output. It can thus be regarded as a hub transforming brain activation patterns into the ontological reality of a language, given that sufficient proficiency is guaranteed. Lastly, automaticity and mandatoriness are given as well. Importantly, word-world pairing must not be confused with word retrieval. Engaging in word retrieval may fail, but we cannot fail to know what a certain word means, given that the meaning of the word is known.

Summarizing those findings, it becomes clear that while adhering to some notions of modularity, such as domain specificity and automaticity, the conceptual-intentional interface still falls short on the notion of restricted input formats, as can be seen in the case of language perception of different categories. Empirical findings about the embodiment of concepts (Jeannerod, 2006) and the multi-modal activation of non-action words relying on conceptual and spatial structure (Jackendoff, 2011) argue against modularity. The output following those activation patterns (imagery, movement or planning) (Zwaan, 2014) likewise counts as evidence against conceiving of the conceptual-intentional interface as a module.

3.6 Modularity of the Computational Hub

The computational hub giving rise to the structure of language is considered the only uniquely human element of language (Hauser et al., 2002).

In terms of information encapsulation it can hardly be regarded as modular; after all, it is unknown what exactly its inputs (the elementary building blocks of meaningful units, computed by merge) look like. Furthermore, the system’s outputs are ubiquitous. It gives rise to structure in language as well as in other higher cognitive functions such as planning, structuring thoughts or imagination. It may be regarded as the mechanism giving our whole cognitive apparatus some structure, of which language ultimately mirrors an instance. The same could be seen in the set formation example, where the process grants humans an advantage in a visual search paradigm (Hermer-Vazquez et al., 1999).

Dissociability, in the sense that humans completely lack the ability to structure their thoughts, is not given in the case of the computational hub manifesting itself in syntax. Even though breakdown patterns characterized by an inability to structure language exist in different forms of dyslexia, other basic computational principles stay intact in these patients (Siegel, 2006).

In the case of domain specificity, two different stances may be taken. Firstly, grammar and syntactic structure are used in all kinds of modalities, which argues against domain-specificity. Secondly, in a more computational sense, the basic property of language, namely recursion, is not only seen in language itself but also in the natural numbers. The basic principle is thus not reserved for syntactic structure but also affords numerosity (Xu & Regier, 2014). Both stances thus speak against domain specificity of syntax. The last characteristic of a module is partly fulfilled, in that the application of correct grammar is automatic once it is learned. However, the process is not completely mandatory, since we can perfectly well produce sentences with incorrect grammar, even though that may require some additional effort.

To sum up the findings about the modularity of the computational hub, it becomes clear that this uniquely human part also falls short with respect to modern notions of modularity. While the basic principles of computing syntax stay intact even in a range of dyslexic disorders, the basic recursion principle may also be applied to other recursive systems. The basic computational principle carried out by the hub, conceived as a set formation tool, is likewise not restricted in its scope of inputs and outputs and finds application in different domains. Therefore the computational hub may not be seen as strictly modular either.

To conclude this section I want to summarize the notions of modularity with respect to the building blocks of human language. The two systems responsible for language as a tool for communication partly adhere to the modern massive modularity hypothesis. However, not all criteria are met consistently, and adherence to modularity heavily depends on the interpretation of what exactly constitutes a module. In a broad sense the faculty of language may thus be regarded as a system made up of several modules. Problems with that view arise with the notion of a computational hub giving rise to syntactic structure. Especially with the notion of language as a tool for thought (Berwick & Chomsky, 2015), it becomes increasingly difficult to define all parts of the faculty of language as strictly modular. The cross-modal nature of language poses a threat to modularity conceptions.

The following section will regard the faculty of language from within the radically different framework of predictive coding, which tries to overturn the conception of modularity completely by proposing a computational principle unifying brain function, whereby notions of functional specificity become superfluous.

4 Predictive Processing

The following section follows a similar outline to the preceding one: first, the theory of predictive processing and its origins are sketched so that the reader gains a well-grounded understanding, followed by an inspection of how the theory’s accounts apply to the building blocks of the broad faculty of language. For clarity I will refrain here from drawing connections or comparisons to the modularity approach. The last section of the article is reserved for a detailed juxtaposition and integration of the two frameworks introduced.

The predictive coding framework gained substantial popularity during the last decade. The theory aims at explaining cognitive processes in the brain by a single computational mechanism. The most appealing feature of this approach thus lies in its mathematical efficiency in holistically explaining computations in the brain (A. Clark, 2013).

The theory drew inspiration from the data sciences. It was originally developed during the 1950s by James Flanagan at Bell Laboratories in the US as an approach to an algorithm for data compression, more precisely sound and image compression. The central idea was that not the whole image had to be processed, but only the discrepancy between the expectation of the next event or stimulus in time and the actual appearance of that stimulus (Atal, 2006). This idea recently found application in describing information processing in the brain.
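The compression idea described above, transmitting only the discrepancy between what was expected and what actually arrived, can be illustrated with a minimal sketch. The predictor used here ("the next sample equals the current one") and the function names `encode` and `decode` are assumptions chosen for illustration; they are not the algorithm developed at Bell Laboratories.

```python
def encode(samples):
    """Replace each sample by its prediction error (actual minus expected)."""
    residuals = []
    prediction = 0  # assumed expectation before any sample has been seen
    for actual in samples:
        residuals.append(actual - prediction)
        prediction = actual  # hypothetical predictor: next value equals current value
    return residuals


def decode(residuals):
    """Rebuild the signal by adding each error to the running prediction."""
    samples = []
    prediction = 0
    for error in residuals:
        actual = prediction + error
        samples.append(actual)
        prediction = actual
    return samples


signal = [100, 101, 103, 103, 104, 110]  # made-up slowly changing signal
residuals = encode(signal)
restored = decode(residuals)
```

For a slowly changing signal most residuals are small numbers, which is what makes them cheaper to store or transmit than the raw samples, while the original signal remains fully recoverable.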

At the heart of the predictive processing framework lies the idea that people always perceive and interpret their surroundings based on previous experience (see Figure 6). Stated in a more extreme way: we only perceive and process the error between our expectations and what actually occurs in our surroundings. As long as our experiences confirm our expectations about the world around us, we barely have to exercise mental effort to process the continuous stream of information we are confronted with. Should the experience offered by our surroundings differ from our expectations, however, it is necessary to update our models of the world. At a very early stage of processing we begin to perceive what is happening in the world around us. This perception is already filtered by expectations, which arise as a result of experience. Should current perceptions not meet expectations, a discrepancy is registered. This prediction error, a failure to integrate information with the current model of the world, is then propagated upward in the processing stream and the brain integrates the new information. This process amounts to a constant updating of our model of the world by incorporating new information that does not fit the current model. By incorporating this information, the future occurrence of prediction errors is minimized as much as possible. In other words, the actual world around us is constantly tested against a theoretically optimal version of the world entertained by our brains. Any kind of information processing is influenced by our experience and knowledge and is therefore regarded as top-down driven (A. Clark, 2013).
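The loop described above (expect, register a discrepancy, update the model so that future errors shrink) can be sketched in a few lines. This is a deliberately minimal illustration under stated assumptions: the model of the world is a single number, and the function name `update_model` and the `learning_rate` parameter are hypothetical.

```python
def update_model(expectation, observation, learning_rate=0.5):
    """Register the prediction error and move the expectation towards the observation."""
    error = observation - expectation
    return expectation + learning_rate * error, error


expectation = 0.0  # assumed initial model of the world
errors = []
for observation in [10.0] * 6:  # the world keeps presenting the same value
    expectation, error = update_model(expectation, observation)
    errors.append(abs(error))
```

With a stable environment each update leaves a smaller error than the one before, mirroring the claim that a model that fits the world requires little further processing.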

Figure 6 . shows information processing as implemented in the predictive coding framework, where prediction error is propagated upward from sensory organs towards higher processes in the brain and prediction is propagated downward to adjust

The idea has been conceptualized in mathematical terms during the last decade. Information processing in the brain is characterized as a Bayesian hierarchical generative model with bi-directional computing. Bayesian means that the initial (a priori) probability of a hypothesis being true is not zero, and that this probability is updated given the perceived information (Chalmers, 2013). Generative refers to the fact that the model of the world represents and constantly updates information on the grounds of previous experiences and perceptions. It captures the most likely structure of the real world. This higher-level knowledge of the structure of the world is then employed in a top-down manner to explain the perceived sensory signal as completely as possible. Subsequently, only the residual error in the processing stream is taken into consideration (Rao & Ballard, 1999). This error in turn is transported upwards in the hierarchical structure and guides continuous model updating at higher levels, aimed at better explaining away sensory signals in the future. This approach must be seen as a structure with numerous levels, where the higher ones always generate optimal states (predictions) not only for immediate sensory experience but also for the different intermediate layers of the processing stream. In terms of neural models, those are the hidden units constantly fine-tuning their predisposed states in order to generate as little error as possible (Hinton, 2007).
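The Bayesian updating and the residual error described above can be written compactly. The following is a sketch in standard notation; the symbols $h$ (a hypothesis about the state of the world), $s$ (the sensory signal), $g$ (the generative mapping from a hypothesis to a predicted signal) and $\varepsilon$ (the prediction error) are introduced here for illustration only and are not taken from the cited works.

```latex
% Bayes' rule: the non-zero prior P(h) is revised by the evidence s into the posterior P(h | s)
P(h \mid s) = \frac{P(s \mid h)\, P(h)}{P(s)}, \qquad P(h) > 0

% Residual (prediction) error: the part of the signal that the top-down prediction g(h) leaves unexplained
\varepsilon = s - g(h)
```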

Such an architecture is called a duplex architecture because it takes care of two processes simultaneously, namely error propagation upwards in the hierarchy and prediction downwards. In such an architecture, error propagation is handled by separate error units, and the explanation of the error (the prediction-making) is handled by functionally distinct representation units. According to this architecture, there are two highly specialized information streams, each operating in one of the two directions and feeding into each other at the hierarchical nodes of the model (Friston, 2005). This way, the novelty of stimuli can be processed inherently as an error of expectation along a separate pathway (Spratling, 2008). Different neural populations in the brain subserve the roles of error and representation: superficial pyramidal cells pass information up the hierarchy, whereas deep pyramidal cells pass information downwards (Friston, 2005, 2010).
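The division of labour between error units and representation units can be made concrete with a toy two-level hierarchy. This is a sketch under strong simplifying assumptions, not the model of Friston (2005) or Rao and Ballard (1999): every unit is a single number, each level predicts the level below by simply passing its own value down, and the names `run_hierarchy`, `lower`, `higher` and `rate` are hypothetical.

```python
def run_hierarchy(sensory_input, steps=200, rate=0.2):
    """Two levels: predictions flow downward, prediction errors flow upward."""
    lower, higher = 0.0, 0.0  # representation units (assumed starting states)
    error_at_input, error_at_lower = 0.0, 0.0  # error units
    for _ in range(steps):
        # Downward stream: each level predicts the activity of the level below it.
        predicted_input = lower
        predicted_lower = higher
        # Upward stream: error units hold whatever the predictions failed to explain.
        error_at_input = sensory_input - predicted_input
        error_at_lower = lower - predicted_lower
        # Representation units adjust so that the errors they receive shrink.
        lower += rate * (error_at_input - error_at_lower)
        higher += rate * error_at_lower
    return lower, higher, error_at_input, error_at_lower


lower, higher, error_at_input, error_at_lower = run_hierarchy(sensory_input=1.0)
```

The update rule is plain gradient descent on the summed squared errors of both levels. Keeping the two error variables separate from the two representation variables is what mirrors the duplex idea: one set of quantities travels up, the other travels down.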

way, we do not rely on absolute values when representing any feature of any object in the world. Indeed, empirical evidence for optimality in perception can be found in illusions. An illusion reflects the adoption of an optimal computational mechanism. The limits of this optimality are reached, however, when the assumptions under which the mechanism operates are no longer reasonable. Thus in the case of an illusion, the assumptions on which prior experiences are based are violated by a new and very uncommon frame of reference, which is necessary for the ‘illusion’ to occur (Weiss et al., 2002). Consider, for example, the following thought experiment: when assessing the height of a building, we may not have a fixed value as a representation of that height (H); rather, we represent it as a probability distribution around H, encoding the probability that the real height of the building is around H given the current evidence provided by sensory information (Knill & Pouget, 2004).
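The building-height thought experiment can be made concrete if the belief about H is assumed to be Gaussian, which is an assumption of this sketch rather than a claim of the cited work. The function name `combine_gaussian` and all numbers are made up for illustration.

```python
def combine_gaussian(prior_mean, prior_variance, observed, observation_variance):
    """Combine a Gaussian prior belief with one noisy Gaussian observation."""
    prior_precision = 1.0 / prior_variance
    observation_precision = 1.0 / observation_variance
    posterior_variance = 1.0 / (prior_precision + observation_precision)
    # Each source is weighted by its precision (the inverse of its variance).
    posterior_mean = posterior_variance * (
        prior_precision * prior_mean + observation_precision * observed
    )
    return posterior_mean, posterior_variance


# Hypothetical values: prior belief of 50 m (variance 100), visual estimate of 60 m (variance 25).
mean_height, variance = combine_gaussian(50.0, 100.0, 60.0, 25.0)
```

The resulting belief lies between expectation and observation, closer to the more reliable of the two, and is narrower than either source on its own. The height is thus never stored as a single fixed value but as a distribution that sharpens with evidence.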

The McGurk effect constitutes a more concrete and compelling example of a perceptual illusion. It further demonstrates the cross-modal applicability of the effect. This auditory illusion describes the change in perception of a sound in the face of a discrepancy between an auditory and a visually presented stimulus. A multi-modal prior changes the perception of the auditory stimulus through an expectation that is congruent with the visual stimulus. Such multi-modal priors also arise in response to environments which are too rich for a stimulus to be processed by a single modality alone. For example, when ordering drinks in a noisy bar, it would be impossible to discriminate voices and words through hearing alone, but seeing faces, mouths and the location of the sound source guides information extraction in a Bayesian way, ultimately reconstructing the original input (Wilkinson, 2014). Mounting evidence for humans as nearly optimal Bayesian agents can be found across several domains, such as adaptation to motor error in the nervous system (Körding & Wolpert, 2004) or memory (Körding, Tenenbaum, & Shadmehr, 2007).
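A Bayesian reading of the McGurk effect can be sketched as the fusion of two sensory likelihoods over candidate syllables. All probabilities below are invented for illustration, the two cues are assumed to be conditionally independent, and the function name `fuse` is hypothetical; the sketch only shows how an intermediate percept can win when hearing and vision disagree.

```python
def fuse(prior, auditory_likelihood, visual_likelihood):
    """Posterior over syllables given a prior and two independent sensory cues."""
    unnormalised = {
        syllable: prior[syllable]
        * auditory_likelihood[syllable]
        * visual_likelihood[syllable]
        for syllable in prior
    }
    total = sum(unnormalised.values())
    return {syllable: value / total for syllable, value in unnormalised.items()}


prior = {"ba": 1 / 3, "da": 1 / 3, "ga": 1 / 3}   # no syllable favoured in advance
auditory = {"ba": 0.60, "da": 0.35, "ga": 0.05}   # the sound resembles "ba"
visual = {"ba": 0.05, "da": 0.35, "ga": 0.60}     # the lips resemble "ga"
posterior = fuse(prior, auditory, visual)
percept = max(posterior, key=posterior.get)
```

Neither cue on its own favours "da", yet it is the only syllable that both cues find reasonably plausible, so it dominates the combined posterior.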

To illustrate how the predictive processing account approaches higher functions, consider the example of the supplementary motor area’s role in motor planning, where predictions are made about the sensory consequences of a movement even before its actual onset (Makoshi, Kroliczak, & van Donkelaar, 2011). In planning a motor sequence (already quite a high-level task), the Supplementary Motor Area (SMA) could be seen as an intermediary node of the hierarchical model, receiving top-down information about an abstract goal and in turn generating top-down information about the sensory consequences of the movement sequence. The same can hold true for more abstract planning, memory or retrieval processes, where intermediate nodes receive input and transform it into expectations. In the case of imagery, imitation or planning, those expectations are then not passed down the hierarchy completely. This in turn leads to an absence of bottom-up information, so the information is only entertained at intermediate nodes before it is actually transformed into observable action. Forgetting may happen along the same paths: information is circularly entertained at top and intermediate layers of the model and only passed downwards when needed. The absence of validating or diverging bottom-up information makes the information prone to error. In the case of word-pattern activation, the pattern may have deteriorated too much for a word to be reliably retrieved, so we have to look it up in the dictionary to readjust internal (intermediate) representations by means of bottom-up sensory input.

The remainder of this section is reserved for investigating evidence that the three building blocks giving rise to the broad faculty of language follow predictive processing principles. It is noteworthy that the ultimate aim of Predictive Processing is to explain brain function holistically with one unifying computational principle; for the sake of explanatory convenience, however, language is not approached here as a whole and integrated into the rest of human functioning. The blocks it is made of serve as an explanatory framework that gives the analysis structure and meaning. Mirroring the structure of the previous section makes the subsequent juxtaposition of the frameworks easier for the reader to follow and interpret.

4.1 Predictive processing in the Sensory-motor System

There is ubiquitous evidence that the sensory-motor system behaves in a predictive manner across modalities, as can be seen in the aforementioned example of the McGurk effect (Wilkinson, 2014) or in the compelling evidence of visual predictive processing demonstrated with the Müller-Lyer illusion, where the model entertained by different populations serves as the crucial factor for the illusion to occur (Howe & Purves, 2005). However, as noted before, the features of the sensory-motor system that matter for the current article are language perception and production.

Because the PC account postulates an interplay between top-down and bottom-up processes, all predictive accounts of natural language necessarily entertain the idea that language always exists in the realm of thought and communication at the same time; the distinction between language as a tool for thought and language as a tool for communication thereby disappears. Two important concepts in the search for a predictive processing account of language are surprisal and entropy. While the former refers to how unexpected the current word is given the preceding context (formally, the negative logarithm of its probability), the latter refers to the uncertainty of the probability distribution over the possible words following the currently processed one. So, for example, the surprisal of “dog” following the word “walk” is relatively low, because we have learned in the past that dogs need to go for a walk frequently. Entropy relates to the width of the probability distribution over the words that may follow a stream of words. For example, “To maintain a fit and healthy lifestyle, I frequently...” has quite a high entropy, since there are many possibilities to achieve that goal, whereas the sentence “Whenever I take the leash my dog knows we are going for a...” has a relatively low entropy, because most people take their dogs for walks and the possible activities one does with a dog are quite limited. The probability of successfully completing those sentences could also be interpreted with the help of the so-called “cloze probability” of words, a measure of the lexical predictability of a word (Taylor, 1953). The cloze probability may therefore be based on the ability of humans to detect statistical regularities in everyday language; the detection of those regularities may, in turn, be interpreted as top-down driven signaling resulting from Bayesian computations in the brain.
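The two measures can be illustrated with a minimal sketch using their standard information-theoretic definitions, surprisal as the negative log probability of a word and entropy as the expected surprisal over all candidate continuations. The next-word probabilities below are invented for illustration only and are not estimates from any corpus or from the studies cited here.

```python
import math


def surprisal(prob):
    """Surprisal in bits: the less probable a word, the higher the value."""
    return -math.log2(prob)


def entropy(dist):
    """Shannon entropy in bits of a next-word probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical continuations of "...my dog knows we are going for a"
leash_context = {"walk": 0.90, "run": 0.05, "swim": 0.03, "drive": 0.02}
# Hypothetical continuations of "...healthy lifestyle, I frequently"
fitness_context = {"run": 0.25, "swim": 0.25, "cycle": 0.25, "stretch": 0.25}

print(entropy(leash_context), entropy(fitness_context))
print(surprisal(leash_context["walk"]), surprisal(leash_context["swim"]))
```

In this toy setting the leash sentence has the lower entropy because one continuation dominates, and the expected word “walk” carries far less surprisal than the unexpected “swim”.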

Willems, Frank, Nijhof, Hagoort, and Van den Bosch (2015) indeed found fMRI signal correlates of surprisal and entropy in spoken language in a wide range of cortical areas, demonstrating the predictive nature of processing spoken natural language at different levels of interpretation, beginning with word forming. Besides that, predictive processing correlates of the speech perception system recruiting speech production systems in situations of high contextual constraint have been found, which speaks in favour of predictive information processing in the sensory-motor system. Pickering and Garrod (2007) propose that the production system takes over the role of an emulator predicting future input on the syntactic, the semantic and the phonological level. Evidence for such an account is found in the covert imitation of tongue muscle movement when listening to language (Fadiga, Craighero, Buccino, & Rizzolatti, 2002). Furthermore, production-related lip activity is associated with activity in Broca’s area, which takes a mediating role between production and perception systems in such an emulator framework (Watkins & Paus, 2004). From a predictive processing point of view, it could be argued that the production system serves as an intermediate layer adjusting mirror neuron activity to the perceived input when the context is sufficiently predictable. The computational demands imposed upon perception systems are thereby smaller, because these systems receive expectations of one’s own and others’ lip or tongue movements as guidance.

Alongside the evidence for other tasks of the sensory-motor system, the correlates of entropy and surprisal support the hypothesis of predictive processing of sensory input with regard to language.

4.2 Predictive Processing in the Conceptual-intentional Interface

The true nature of the origin of word-world pairing will probably remain beyond the scope of scientific enquiry. However, since those relationships already exist and we have learned them throughout our lives, there has to be some mechanism establishing and retrieving those concepts. Although research is scarce, I want to propose an idea of how the retrieval of semantic meaning may be thought of in terms of predictive processing. From a predictive processing point of view, concepts are models entertained in the brain and dynamically updated upon encounters with instances of a concept, but the theory does not preclude the notion of some innate mechanism enabling their formation (A. Clark, 2013). A concept fulfils the role of error minimization in perception, thereby enhancing the predictability of the world. Minimization is realized by adjusting multi-modal representations of concepts, which are acquired through associating encounters with acquired models. Encounters will improve the representative nature of a concept (updating it) through the deviation of the encounter from the hypothetical optimal model (or prototype of a concept) (Friston, 2010).
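The idea that a concept is updated through the deviation of an encounter from its prototype can be sketched as a simple error-driven update. This is a deliberately simplified delta-rule illustration, not the free-energy formulation of Friston (2010); the feature vectors, the function name `update_prototype` and the learning rate are hypothetical.

```python
def update_prototype(prototype, encounter, learning_rate=0.1):
    """Move a concept prototype towards a new encounter.

    The prediction error is the deviation of the encounter from the
    prototype; the prototype is shifted by a fraction of that error.
    """
    errors = [e - p for p, e in zip(prototype, encounter)]
    updated = [p + learning_rate * err for p, err in zip(prototype, errors)]
    return updated, errors


# Hypothetical feature values (e.g. size, furriness) for a concept
prototype = [0.0, 1.0]
encounter = [1.0, 1.0]
prototype, errors = update_prototype(prototype, encounter, learning_rate=0.5)
print(prototype, errors)
```

Repeated encounters that match the prototype produce errors near zero and leave it almost unchanged, whereas deviating encounters shift it, which is one way to read the claim that encounters improve how representative a concept is.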

Lupyan and Clark (2015) made an additional effort to explain language at the conceptual-intentional interface, proposing an active relationship between learned categories and the perception of instances of those categories. The framework is quite reminiscent of the strong Sapir-Whorf hypothesis rooted in linguistic relativity, which states that language determines cognitive categories as well as thought as a whole (Whorf, Carroll, & Chase, 1956). However, Lupyan and Clark (2015) add the aspect of prediction, involving the fine-tuning of perception and subsequently the finer distinction between categories through exposure. For example, the many different Eskimo words for snow actually shaped the perception of different kinds of snow, thereby actively shaping different concepts of snow where previously it was perceived merely as frozen white water (Lupyan, Rakison, & McClelland, 2007).

This finding already constitutes one hallmark of predictive processes in object-meaning relationships. The second one is the notion that words are not stored hard-wired in the brain as on a kind of hard drive, but that their activation correlates with the activation usually recruited by imagination and mirrored during perception: what fires together wires together. This prototype activation has been addressed by Toni et al. (2008), who advocate a decomposition of concepts into their abstract constituent parts. Those activations, however, have to be augmented with matched instances of similar structures, along with non-visual sensory input and an action model appropriate for the motor system (Barsalou, Simmons, Barbey, & Wilson, 2003). So for any given word (e.g. cat) we may have a prototype cat in our mind, fulfilling different characteristics (e.g. 4 legs, sharp ears, small, long wobbly tail etc.), but also furry and agile. If we imagine a cat, some of those characteristics first form activation in our mind, leading to a pattern of activation which is also elicited by witnessing a cat. However, those features ought not to suffice for being sure about our representation, so at this point the entropy of the
