Language Contact in the Balkan Sprachbund

A study of transparency in Italian, Russian, Bulgarian, Romanian and Greek

Abstract

When communicating, speakers map meaning onto form. It would thus seem natural for languages to show a one-to-one correspondence between meaning and form, but this is often not the case. This perfect mapping, i.e. transparency, is continuously violated in natural languages, giving rise to zero-to-one, one-to-many and many-to-one opaque correspondences between meaning and form. However, transparency is a changeable feature, which can be influenced by language contact. In contact situations, languages tend to lose some of their opaque features and become more transparent. This study investigates transparency in a very specific contact situation, the Balkan Sprachbund (BS), by examining five languages: Italian, Russian, Bulgarian, Romanian and Greek. We considered two separate theories: convergence and diglossia. Following the first theory, we hypothesized that convergence is the cause behind the feature sharing typical of the BS, and thus predicted that Bulgarian and Romanian would be more transparent than Russian and Italian, two languages of the same families that do not belong to the BS. Our second hypothesis, by contrast, took Greek to be the source language behind the BS and therefore predicted that the BS languages Romanian, Bulgarian and Greek would share the same features. We investigated twenty-five opacity features, divided into five categories: Redundancy (one-to-many), Fusion (many-to-one), Discontinuity (one meaning is split into two or more forms), Form-based Form (forms with no semantic counterpart: zero-to-one) and a group of typical BS features. The results bear out our second hypothesis: Romanian, Bulgarian and Greek present the same features, which points in the direction of diglossia as the underlying cause of the BS.

Keywords: transparency, FDG, language contact, Balkan Sprachbund, diglossia

1. Introduction

The function of human languages is first and foremost communication. In order to achieve this goal, language users must constantly transform meaning into form and form into meaning. This leads to a tight relation between two separate levels of linguistic organization: a content level and a formal level. It would, therefore, be natural to expect a perfect and transparent one-to-one correspondence between these two levels, namely the existence of one form for every meaning and vice versa. This is, however, not the case. To our knowledge, no perfectly transparent language exists. All languages allow some degree of discontinuity, fusion and redundancy, to mention a few ways in which transparency can be violated. This introductory chapter outlines the concept of transparency in detail, comparing it to other linguistic notions such as simplicity, ease of acquisition, iconicity and regularity (1.1), and discusses its relation to language contact (1.2).

1.1 Transparency

The term transparency has been interpreted in various ways over the years. For the purposes of this research, we adopt Hengeveld’s (2011) definition, according to which transparency is a one-to-one correspondence between meaning and form. It is nonetheless important to outline other concepts with which transparency can easily be confused: simplicity, ease of acquisition, iconicity and regularity.

Within simplicity, we can make a distinction between absolute and relative simplicity. Absolute simplicity is the simplicity of a language system as such and can be calculated by looking at the amount of surface form a language has (surface simplicity) and at the levels of embedding it needs (structural simplicity) in order to communicate a certain content (Miestamo, 2006). In these respects, the more linguistic material and layers of embedding a language needs, the more complex it is. Relative simplicity is also known as ease of acquisition. According to Miestamo (2006) and Kusters (2003), the easier it is for L2 learners to learn a language, the simpler that language must be. The term iconicity, on the other hand, refers to the predictability of a word’s meaning from its form (McWhorter, 1998). In spoken languages, however, the relation between word meaning and form is largely arbitrary (Leufkens, 2013). Finally, another notion which is often confused with transparency is regularity. Regularity can refer to the predictability of paradigms, such as verbal conjugations and nominal declensions, as well as to the formation of compounds (Leufkens, 2013). Yet predictability of paradigms does not amount to transparency.

All these notions have something in common with transparency as defined by Hengeveld (2011), but differ conceptually from the latter. Transparency is, in fact, an interface property between the conceptual and the formal levels, and not an intrinsic property of the language itself.

Like any other language property, transparency is dynamic and hence subject to influence and change. The most interesting developments are observed in situations of isolation and in situations of language contact.

In the absence of language contact, languages evolve in one specific direction, namely from simple to complex (Lupyan & Dale, 2010) and from transparent to opaque (Hengeveld, 2011b; Seuren & Wekker, 1986). In such a context, certain formal units can change or lose their meaning completely and thus become opaque by undergoing a process also known as ‘maturation phenomenon’ (Dahl, 2004). Nominal expletives are a prime example of such a phenomenon. Often described as ‘historical junk’ and ‘male nipples’ (Lass, 1997), they once had a semantic meaning which was lost over time, giving way to an empty and opaque placeholder.

Whereas isolation causes a language to become more opaque, language contact may push it towards transparency. This shift might be due to a competition between different factors, such as economy and transparency. In contact situations, the need for intelligibility is extremely high, which in turn forces speakers to choose more transparent forms. As shown by Kusters (2003), the more L2 learners a language has, the more transparent it becomes over time. Similar results are reported by Olthof (2015) for Norwegian and by Leufkens (2013) and Seguin (2015) for creole languages. Creole languages are exemplary in this respect, as they originate in very peculiar contact situations between much older languages. The results of both Leufkens (2013) and Seguin (2015) clearly show that creoles are significantly more transparent than their source languages.

The evolution of a language towards opacity or transparency is not random. Based on a transparency study of four natural languages, Hengeveld (2011b) shows that opaque features are not randomly distributed across languages, but that the existence of certain features implies the existence of others, forming an implicational hierarchy. Such a hierarchy can often be interpreted as a diachronic pathway (Greenberg, 1978). It thus follows that languages start out relatively transparent and acquire opaque features over time, following the order of the implicational hierarchy. This is in agreement with the findings of several typological studies, such as Leufkens (2013, 2015), Olthof (2015) and Seguin (2015). After investigating 25 languages, Leufkens (2015) drew up the implicational hierarchy in (1), which mirrors these diachronic changes. This hierarchy is implicational in the sense that the existence of a certain feature in a language implies the existence of all features lower in the hierarchy.

(1) nominal expletives, clausal agreement →
    grammatical gender, tense copying →
    suppletion →
    phrasal agreement, irregular stem formation →
    predominant head-marking →
    morphophonologically conditioned stem alternation →
    morphologically and morphophonologically conditioned affix alternation →
    redundant referential marking, phonologically conditioned stem and affix alternation, grammatical relations
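
The implicational reading of (1) can be made concrete with a small check. The following Python sketch is purely illustrative and is not part of Leufkens’ (2015) method. It assumes that each arrow in (1) separates one tier from the next, that "lower in the hierarchy" means further along the arrows, and that features within one tier do not imply one another. The names HIERARCHY and violations are hypothetical.

```python
# Tiers of hierarchy (1), in the order in which they appear between arrows.
# Assumption: a feature implies every feature in all later tiers.
HIERARCHY = [
    {"nominal expletives", "clausal agreement"},
    {"grammatical gender", "tense copying"},
    {"suppletion"},
    {"phrasal agreement", "irregular stem formation"},
    {"predominant head-marking"},
    {"morphophonologically conditioned stem alternation"},
    {"morphologically and morphophonologically conditioned affix alternation"},
    {
        "redundant referential marking",
        "phonologically conditioned stem and affix alternation",
        "grammatical relations",
    },
]


def violations(features):
    """Return (present, missing) pairs that contradict the hierarchy.

    A pair is reported when a feature is present in the language although
    a feature from a later tier, which it should imply, is absent.
    """
    found = []
    for i, tier in enumerate(HIERARCHY):
        for present in sorted(tier & features):
            for later_tier in HIERARCHY[i + 1:]:
                for missing in sorted(later_tier - features):
                    found.append((present, missing))
    return found
```

Under these assumptions, a feature set that contains grammatical gender but lacks suppletion is reported as a violation, whereas an empty set, or a set made up only of the final tiers, conforms to the hierarchy.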

Every contact situation is different, just like every language born in such a context is unique. In the present study, we focus on one particular contact situation: the Balkan Sprachbund. The following chapter will introduce the features of this special linguistic league and outline its implications for this research.

2. The Balkan Sprachbund

A sprachbund is “understood as two or more geographically contiguous and genealogically different languages sharing grammatical and lexical developments that result from language contact rather than a common ancestral source” (Friedman, 2006). The Balkan Sprachbund is a linguistic league of languages from the Balkan area. These languages are genealogically distant from one another and yet share peculiar features found only within the league, which are believed to have developed through constant contact. This chapter introduces the geography and history of the Balkans, the languages involved and their features, and finally the theories on the origin of the Balkan Sprachbund.

2.1. The Balkans and their history

The Balkans can be geographically defined quite easily. They are bordered by the Ionian, Mediterranean and Aegean Seas to the south, by the Adriatic Sea to the west and by the Black Sea to the east. The Sava and Danube rivers define their northern border (Tomić, 2006). Politically speaking, on the other hand, the term Balkans usually refers to Albania, Bulgaria, Greece, Romania, “European” Turkey and the countries that formerly made up Yugoslavia, namely Bosnia and Herzegovina, Croatia, Kosovo, Macedonia, Montenegro, Serbia and Slovenia (Figure 1).

Figure 1: The Balkans, source https://www.balkanpeace.org/

Among the historic events concerning the Balkan area in the last two millennia, some had far-reaching linguistic consequences. First, the Balkans witnessed Greek influence and domination, starting as early as the Late Bronze Age (Tomić, 2006) and continuing later with Alexander the Great. Shortly before the start of the Common Era, Rome grew stronger and began to expand. Its domination in the Balkans was strong and lasted for several centuries. In the 4th century, however, the Roman Empire was split between a centralised Old Rome and a New Rome in the east. Two centuries later the Slavs invaded the Balkans. The invasion split the area in two, drawing a line from the Sava river on the Serbo-Croatian border to the Montenegrin-Albanian border. This imaginary line divided the Catholic (west) from the Orthodox (east) sphere of influence (Tomić, 2006). When the Ottoman Empire expanded and conquered the Balkans in the 14th century, it also shaped their borders as we know them today, ushering in six centuries of Turkish domination.

2.2. The Languages

While defining the Balkans is quite simple, the definition of the Balkan Sprachbund is more difficult, as not all the languages from this area belong to this linguistic league. The Balkan Sprachbund is a group of languages in the Balkans sharing a significant number of contact-induced features extending to all linguistic levels (Friedman, 2011). Even though these languages are all Indo-European, they belong to different subgroups. Traditionally the Balkan Linguistic Area (BLA) is associated with four groups: Balkan Slavic (BS), Balkan Romance (BR), Hellenic and Albanian.

Table 1: Languages of the BLA

Language group | Languages
Balkan Slavic (BS) | Bulgarian; Macedonian; Torlak dialects
Balkan Romance (BR) | Aromanian; Istro-Romanian; Megleno Romanian; Romanian (based on the Wallachian dialects)
Hellenic | Northern dialects; Southern dialects
Albanian | Gheg; Tosk
Romani | Balkan; Vlax
Balkan Turkish | West Rumelian Turkish; Gagauz (to a certain extent)
Jewish Languages | Judezmo

BS consists of Bulgarian, Macedonian and the Torlak dialects of south-east Serbia. BR consists of Romanian (based on the Wallachian dialects), Aromanian, Istro-Romanian and Megleno Romanian. Albanian consists of two dialects: Gheg, spoken north of the Shkumbi river, and Tosk, spoken south of it. The Hellenic group also includes two subgroups: the dialects spoken in the north of Greece and those spoken in the Peloponnese. Four other languages are also considered to belong to the BLA: Romani, whose Balkan and Vlax dialects are part of the BLA (Boretzky & Igla 2004), Judezmo (Altbaev 2003), West Rumelian Turkish (WRT) and Gagauz (Dombrowski 2011, Friedman, 2003b, 2006). They were all excluded or ignored by one of the first experts on the matter, Sandfeld (1930), even though they all rightfully belong to the Sprachbund (Friedman 2006). Table 1 lists all BLA languages and Figure 2 shows where they are spoken.

Figure 2: A map of the languages of the Sprachbund1


2.3. The Features

The features typical of the BLA are also known as Balkanisms. The first feature to be observed and reported was the postposed definite article (Kopitar, 1829). Today we know of several features, both phonological and morphosyntactic. Not all experts agree on the number of features, and not all features are present in all languages. We offer here a short summary of the most important ones, following Friedman (2006) and Tomić (2006).

As far as phonology is concerned, the most relevant features are the reduction of unstressed vowels to schwa or to non-syllabic elements, the existence of stressed schwa and the raising of the unstressed mid-vowels /e/ and /o/ to the high vowels /i/ and /u/ respectively. Most BLA languages also lack front rounded vowels, such as /y/, and show an alternation of /l/ and /ł/ before front vowels (Friedman, 2006; Tomić, 2006). As for suprasegmental features, we note the absence of length and tone and the inability of a stress accent to move further into the word than the antepenultimate syllable (Friedman, 2006).

Among the morphosyntactic features, the lack of an infinitival form seems to be the most peculiar one, as well as the most characteristic (Friedman, 2006). In the nominal domain, we find grammaticalized definiteness (demonstrative pronouns that have been encliticized or suffixed to nominals to become definite articles), resumptive clitic pronouns (also known as object pronoun reduplication or doubling), a genitive and dative merger, and very poor case morphology (or a complete lack thereof). In the predicative domain, the most relevant features are the analytic subjunctive, the future formation with an auxiliary derived from “want/will” and the combination of a future and a past tense marker to form the conditional. Finally, on the sentence level, we find sentence-initial clitics and a relatively free word order in unmarked contexts. Table 2 below summarises all the features and the languages they are found in (Friedman, 2006).

Table 2: Summary of the features and the languages they are present in (Friedman, 2006)

Feature | Present in
unstressed vowel reduction to schwa or to non-syllabic elements | Albanian; BR
the existence of stressed schwa | Albanian; Bulgarian; Gheg; Macedonian; Romani; WRT
raising of unstressed mid-vowels | BR; Bulgarian; Gheg; Greek; Macedonian
the absence of front rounded vowels | All but Albanian and Central Gheg
alternation of /l/ and /ł/ before front vowels | BS; Greek; Romani
the absence of length and tone and the inability of a stress accent to move further into the word than the antepenultimate syllable | All but Gheg and Tosk
grammaticalized definiteness | Albanian; BS; BR
resumptive clitic pronouns | Albanian; BS; BR; Greek; Romani
genitive and dative merger | Albanian; BS; BR; Greek
poor case morphology | All
lack of infinitive | All
analytic subjunctive | All
the future formation with an auxiliary derived from “want/will” | Bulgarian; Gheg; MR; Romanian
the combination of a future and a past tense marker to form the conditional | All but BS and Romanian
clitic order | Albanian; BR; Greek; Macedonian

2.4. The Theories

The Balkans saw several foreign dominations over the centuries. Every dominant population also brought new cultures and languages, which play a role in the different theories concerning the creation of the BLA.

2.4.1 Theories on the Balkan Sprachbund

The first theory, by Leake (1814), attributes the cause to a superstrate language, Slavic. Later in the 19th century, Kopitar (1829) and Miklosich (1861) claimed that the Balkanisms developed under the influence of ancient substrate languages spoken by the inhabitants of the Balkans, namely Thracian, Dacian and Illyrian. This theory became very popular at the beginning of the 20th century thanks to Weigand (1928), but it cannot be substantiated: the surviving material of these languages is too limited to prove any such relation (Tomić, 2006). About a century after Kopitar, Sandfeld (1930), among others, suggested that the source of the Sprachbund was Byzantine Greek. According to him (1930:165), Greek influence is “the most natural explanation”, if not the only one, for all the general concordances within the BLA.

Solta (1980) and Gołąb (1984) also claimed that the BLA originated from a substrate language, Latin and Aromanian respectively. Nevertheless, neither theory is borne out: the BLA features are found neither in Latin nor in other Romance languages.

In the second half of the 20th century, linguists started to look at the Balkan Sprachbund in a different light. Civjan (1965, 1979) was the first to propose a convergence model. The term ‘convergence’ refers to the general acquisition of structural similarities between languages (Silva-Corvalán, 1994: 4-5). According to this model, the BLA is the result of languages being used as a means of communication, namely the convergence of one idiom towards the other. Civjan (1965) points out that the cause had always been sought in the past, namely in a substrate language. The trigger, however, lies in the contact situation and is thus to be found not only in the past but also in the present and future. A convergence model was also proposed by Lindstedt (2000), who defines this specific contact situation as a shared drift, that is, parallel changes rather than transfers from one language to another. Lindstedt also sought the cause in the multilingual environment of the Balkans.

These last two theories are both supported by the fact that the Sprachbund features are more numerous in the area with the highest number of co-territorial languages, namely south of lakes Ohrid and Prespa. This area sees five languages intersect: Albanian, Aromanian, Balkan Romani, Greek and Macedonian (Tomić, 2006). Table 3 summarises the various theories.

Table 3: A summary of the theories on the BLA

No. | Main Proponents | Theory | Supported?
1 | Kopitar (1829), Miklosich (1861), Weigand (1928) | Ancient substrate languages: Thracian, Dacian and Illyrian | No
2 | Sandfeld (1930) | Byzantine Greek as source language | No
3 | Solta (1980) | Latin as substrate language | No
4 | Gołąb (1984) | Aromanian as substrate language | No
5 | Civjan (1965, 1979) | Convergence model: languages used as communication means (contact) | Yes
6 | Lindstedt (2000) | Shared drift: parallel changes due to a multilingual setting (contact) | Yes

2.4.2 Kortmann’s Theory

In his study on adverbial constructions in the languages of Europe, Kortmann (1998) observed a clear split among European languages on the basis of their religious sphere of influence (Fig. 3). Languages that were under the Greek Orthodox sphere of influence, on the one hand, and those under the Roman Catholic sphere, on the other, behave clearly differently with regard to adverbial clauses. Thus Kortmann, as well as Décsy (1973) and Blatt (1957) before him, inferred that the official languages used by the Church, Latin and Greek respectively, must have influenced the development of the vernaculars spoken in these regions. The Roman Catholic Church, and thus the Latin language, prevailed in most of Western and Central Europe, as well as in Poland. Latin remained the official language of the clergy in all these countries until the 15th and 16th centuries (Kortmann, 1998). The influence of the Orthodox Church, on the other hand, with Koine Greek as its official language, extended to most of Eastern Europe, including of course the whole Balkan Peninsula. Koine Greek was later replaced by Modern Greek.

Fig. 3: Kortmann’s (1998: 534) representation of the language division between Central and Eastern Europe (indicated by the dotted line)

As far as Eastern Europe is concerned, a new liturgical language was introduced in the 9th century by Saints Cyril and Methodius: Old Church Slavonic (OCS) (Cizevskij, 2000). OCS, also known as the first Slavic literary language, was canonised on the basis of a dialect spoken by Byzantine Slavs living in the Province of Thessalonica (Waldman & Mason, 2006). In the early 12th century, OCS evolved into Church Slavonic (Cizevskij, 2000). While Church Slavonic (CS) is still the liturgical language in the Russian Federation today, every other East-European country followed a different linguistic path. Two very interesting countries in this respect are Bulgaria and Romania. In Bulgaria, Koine Greek and CS alternated continuously until the 18th century. It was only in the 19th century that Bulgarian, together with CS, finally came to be used in the liturgy (Kalkandjieva, 2014). Romania is another very interesting case, as it is the only country where a Romance language and the Greek Orthodox Church fully coexist. CS was the official language of the Romanian clergy until the 18th century, when it finally made way for Romanian (Sava, 2013).

This specific contact phenomenon is known as diglossia. Attested only in multilingual communities (Fishman, 1967), diglossia describes a peculiar sociolinguistic situation in which there is a clear differentiation in function between the language varieties used within the community (Matras, 2009). In such a situation there are thus two varieties: a high or prestige variety (e.g. Koine Greek), generally reserved for official functions, and a low variety (e.g. Romanian), confined to informal contexts. According to Kortmann’s theory, the morphosyntactic development of European languages was strongly influenced by diglossia.

2.5. Hypothesis and Predictions

Many different theories have been proposed in order to explain the nature of the Balkan Sprachbund. They can, however, be synthesised into two main groups: the convergence theories and the superstrate or diglossia theories.

According to the first group of theories, the reason behind the phenomenon known as Balkanism is the convergence among the languages of the BLA. If this is correct, we may hypothesize that languages pertaining to the Balkan Sprachbund will be more transparent than related languages not belonging to the Balkan Sprachbund. After all, as argued above, language contact increases transparency. We will thus investigate transparency in two languages of the BLA, Romanian and Bulgarian, and in two languages belonging to the same families (Romance and Slavic) but not to the BLA, namely Italian and Russian. If this hypothesis is correct, then Romanian and Bulgarian will be more transparent than Italian and Russian.

The second group of theories on the BLA, however, proposes that a superstrate language, or diglossia, has influenced the linguistic development of the languages of the Balkan Sprachbund. As Kortmann (1998) shows, the prestige language could very well be the one used by the Greek Orthodox Church, namely Koine Greek and its successors, OCS and CS. Therefore, our second hypothesis is that the language of the clergy is the underlying cause of Balkanism. We will thus add another language to the investigation, namely Modern Greek. If our second hypothesis is borne out and Koine Greek is the underlying cause, then Romanian, Bulgarian and Modern Greek, the modern counterpart of Koine Greek, will share the same features.

Hypothesis 1: Convergence forced the languages of the BS to become more transparent.
Prediction 1: Romanian will be more transparent than Italian; Bulgarian will be more transparent than Russian.

Hypothesis 2: The language of the Orthodox Church, Koine Greek, and later OCS and CS, had an influence on the linguistic development of the languages of the BS.
Prediction 2: Bulgarian, Romanian and Greek possess the same features.
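
The logic of the two predictions can be spelled out as simple set comparisons. The Python sketch below is only an illustration under stated assumptions: it takes "more transparent" to mean "fewer opaque features present", which is one possible operationalisation and not necessarily the scoring used in this study. The function names, the placeholder languages A, B and C and the feature labels f1 and f2 are hypothetical and carry no empirical content.

```python
def more_transparent(opaque, language, relative):
    """Prediction 1 (assumed reading): fewer opaque features than the relative."""
    return len(opaque[language]) < len(opaque[relative])


def share_same_features(opaque, languages):
    """Prediction 2: all listed languages show exactly the same opaque features."""
    feature_sets = [opaque[name] for name in languages]
    return all(fs == feature_sets[0] for fs in feature_sets)


# Placeholder input, NOT data from the study: three invented languages
# described by invented opacity feature labels.
toy = {
    "A": {"f1"},
    "B": {"f1", "f2"},
    "C": {"f1"},
}

print(more_transparent(toy, "A", "B"))
print(share_same_features(toy, ["A", "C"]))
```

With real data, the first function would be applied to the pairs Romanian/Italian and Bulgarian/Russian, and the second to Bulgarian, Romanian and Greek.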

The following section will outline the methodology.

3. Methodology

This chapter will outline the methodology used in this study. Section 3.1 will introduce the Functional Discourse Grammar framework, while Sections 3.2 and 3.3 will present the features and the methods and materials.

3.1 FDG

Functional Discourse Grammar (FDG) is a linguistic theory developed by Hengeveld & Mackenzie (2008) following Dik's (1978) model of Functional Grammar. While the latter is a predominantly functional model of language, FDG is a structural-functional framework, the goal of which is to find explanations for the structure of human language, while keeping in mind its communicative and functional nature. In FDG, the speaker’s communicative intention is considered the starting point of the construction of a discourse act. It is, therefore, a top-down model of linguistic organisation that follows the whole process of speech formation: from conceptualization to phonetic (or orthographic) form.

The FDG model includes a Grammatical Component interacting with three non-grammatical components: the Contextual, Conceptual and Output Components. The Contextual Component contains information about the speech context, the Conceptual Component is the place where the intention is conceived, and the Output Component is where the message is articulated. The Grammatical Component is itself divided into several levels subject to a hierarchical order. The speaker's intention must first go through a Formulation process, where it is translated into pragmatic units at the Interpersonal Level (IL) and semantic units at the Representational Level (RL). Afterwards, both kinds of units go through Morphosyntactic Encoding, which converts them into morphosyntactic units at the Morphosyntactic Level (ML). The last transformation is Phonological Encoding, a process that converts the morphosyntactic units into phonological units at the Phonological Level (PL). At this point, all units are grammatically complete and only need to be articulated, written or signed. This happens in the Output Component. Even though there is a hierarchy between levels, it is not obligatory to go through all of them. Certain intentions go directly from the Conceptual Component to the IL and then on to the PL, as in the case of Ouch!, which has no semantic or morphosyntactic counterpart. In other words, all four levels are independent but constantly interact with each other. Fig. 3 below represents the general architecture of FDG. Processes are pictured in ovals and levels in rectangles.

Fig. 3: General architecture of FDG

Just like the components, every level has its own internal hierarchical structure. We refer the reader to Hengeveld & Mackenzie (2008) for an in-depth presentation of the framework.


3.1.1 Transparency in FDG

As outlined in 1.1 above, transparency can be defined as a one-to-one correspondence between meaning and form. This definition is, however, quite general, and the FDG framework can help us make it more precise. In FDG, a unit of meaning is a unit, i.e. a primitive, at either the RL or IL, while a unit of form is a primitive at the two lower levels, ML or PL. Given these premises, we could reformulate our transparency definition as follows (Leufkens, 2013:13):

a) Transparency is obtained when one unit at one of the upper two levels of linguistic organisation (IL, RL) corresponds to one unit at one of the lower two levels of linguistic organisation (ML, PL)

It is important to note once again that transparency is not a property of one level, but rather an interface property between levels. From this perspective, the definition in (a) is not precise enough. Within the Grammatical Component, there are four distinct levels. Definition (a), however, only considers the relations between the two upper and the two lower levels (IL/RL and ML/PL), whereas there are also relations between IL and RL and between ML and PL. There are in fact six interfaces in total, namely IL-RL, IL-ML, IL-PL, RL-ML, RL-PL and ML-PL. Transparency should thus be defined as a one-to-one correspondence between linguistic units at all four levels (Leufkens, 2013:13):

b) Transparency is obtained when one unit at one level of linguistic organisation corresponds to one unit at all other levels of organisation
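
The count of six interfaces follows from pairing each of the four levels with every other level. The short Python sketch below enumerates these pairs and labels a correspondence by the number of units on either side. It is a deliberate simplification for illustration: representing a correspondence by two unit counts, and the names LEVELS, INTERFACES and classify, are assumptions of this sketch and not part of the FDG formalism.

```python
from itertools import combinations

# The four levels of the Grammatical Component, from highest to lowest.
LEVELS = ("IL", "RL", "ML", "PL")

# Every unordered pair of distinct levels is one interface: 4 choose 2 = 6.
INTERFACES = list(combinations(LEVELS, 2))


def classify(upper_units, lower_units):
    """Label a correspondence by the number of units at two levels."""
    if upper_units == 1 and lower_units == 1:
        return "one-to-one"   # transparent
    if upper_units == 0 and lower_units >= 1:
        return "zero-to-one"  # form without a meaning counterpart
    if upper_units == 1 and lower_units > 1:
        return "one-to-many"  # e.g. redundancy or discontinuity
    if upper_units > 1 and lower_units == 1:
        return "many-to-one"  # e.g. fusion
    return "other"


print(INTERFACES)
print(classify(1, 2))
```

Only the first label corresponds to transparency in the sense of definition (b); the other three correspond to the opaque relations introduced in the abstract.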

Although, at first sight, definition (b) may seem less specific than the one in (a) above, it is more precise, as it reflects the complexity of the interfaces between the levels of the Grammatical Component. As outlined in section 1, all languages are opaque to some extent and differ from one another only in their degree of transparency. Transparency must therefore not be seen as a binary feature, but rather as a spectrum: a language can be more or less transparent, and its degree of transparency may differ across the six interfaces. The next section will present and explain the features investigated in this study.

3.2 The Features

For the present research, we follow the methodology proposed by Leufkens (2013, 2015), namely the investigation of nineteen transparency features, to which we added five typical of the Balkan Sprachbund. The features are in turn divided into five subgroups, namely Redundancy, Discontinuity, Fusion, Form-based Form and BS Features. This section will introduce and explain each feature.

3.2.1 Redundancy

This subgroup includes all one-to-many relations between levels, as when one pragmatic, semantic or morphosyntactic unit corresponds to more than one semantic, morphosyntactic or phonological unit. The redundancy features investigated in this study are the following: Clausal Agreement and Cross-Reference, Phrasal Agreement, Concord and Tense Copying.

3.2.1.1 Clausal Agreement and Cross-Reference

Agreement is a morphosyntactic operation in which a semantic or grammatical property of one unit, i.e. the controller, is marked on another unit, i.e. the target.

Agreement is found in several contexts, but the phrase is definitely the most common (Corbett, 2006). FDG defines Clausal Agreement as a purely morphosyntactic operation internal to the ML, which copies some features of one unit onto another. The copy is thus semantically empty. FDG, however, also distinguishes another type of agreement operation, namely Cross-Reference, which is the multiple expression of one semantic unit. Hengeveld & Mackenzie (2008) propose a rule of thumb: if an element can occur on its own, it cannot be a copy. There is Clausal Agreement only when both units at ML are obligatory, as in English. We find Cross-Reference in the so-called pro-drop languages, where the subject can be omitted. Both types of agreement are highly opaque, as they create redundancy.


3.2.1.2 Phrasal Agreement

By Phrasal Agreement, we refer to the agreement between the noun and its modifiers, determiners and demonstratives. In some languages, the latter must agree with the noun in number, gender, case and definiteness, as we witness in the French noun phrase in (2) below. The need to mark gender and number on all elements is redundant and therefore opaque.

(2) la        bell-e          fille
    the.F.SG  beautiful-F.SG  girl
    ‘The beautiful girl’

3.2.1.3 Concord

The term Concord refers to the situation in which a morphosyntactic unit and a lexical unit express one single semantic meaning. There are three types of Concord: Plural, Negative and Temporal. As far as the first is concerned, in some languages number can be marked both morphosyntactically and lexically. In the English noun phrase two cars, for example, the plural is expressed twice, namely by two and by the morpheme -s, which clearly leads to opacity. Negative Concord is the coexistence of two negative elements in the same clause. The English Negative Polarity Item anyone is not semantically independent and thus needs to be supported by a negation, as in (3) below, which leads to opacity.

(3) I haven’t seen anyone.

Finally, Temporal Concord is the co-occurrence of a morphosyntactic tense marker and a temporal adverb, as in (4). Concord is nearly universal (Leufkens, 2015) but will still be investigated for the sake of completeness.


3.2.1.4 Tense Copying

The last redundancy feature is Tense Copying, namely the multiple marking of time reference in a main clause and its subordinate clauses. This copying mechanism is a morphosyntactic process that copies the tense value of the main clause, for instance past (said), onto its subordinate clauses (was):

5 Mary said that she was studying Latin.

Being purely morphosyntactic and not semantically or pragmatically motivated, tense copying is to be considered opaque.

3.2.2 Discontinuity

Discontinuity is another one-to-many correspondence. This group includes all phenomena where one pragmatic or semantic unit is split up into two or more morphological or phonological units. The discontinuity features investigated in this study are the following: Extraposition and Extraction, Raising, Circumfixes and Circumpositions, Infixes and Non-Parallel Alignment.

3.2.2.1 Extraposition and Extraction

At the IL and RL, modifiers and the nouns they modify belong together. However, at the ML they can sometimes be separated, with the modifier being moved to the left periphery, Extraction, or to the right-hand side of the sentence, Extraposition. The former is often resorted to when an element is too complex and heavy or when it is topicalized (6a). Extraposition (6b), on the other hand, is usually the result of focalization. Both phenomena are highly opaque.

6a [About dogs] we have several books in stock.
Extraction
6b We have several books in stock [about dogs].
Extraposition


3.2.2.2 Raising

Certain verbs allow an argument that semantically belongs to an embedded clause to behave syntactically as an argument of the main clause (Leufkens 2015). This is the case of the English verbs seem and appear. In example (7a) below, John is the subject of the embedded clause. In its semantic equivalent (7b), on the other hand, John behaves as the subject of the main clause. Raising is to be considered non-transparent.

7a It seems that John is smart
7b John seems smart

3.2.2.3 Circumfixes and Circumpositions

Circumfixes are particular kinds of affixes. They are one unit at IL and RL but are realised in two separate units at ML, such as the morphological marker of the past participle in German, e.g. ge-wuss-t 'known'. Circumpositions are freestanding words, therefore one unit at IL and RL, that are separated at ML, like the French negation ne...pas. Extremely rare, they are very opaque.

3.2.2.4 Infixes

As opposed to Circumfixes, Infixes are affixes inserted into a morphological unit. They are not discontinuous per se, but they create discontinuity in the unit they are inserted in. The causative marker <[(o)ʔ]> or <[(o)b]> in Kharia is an example of an infix. The word botoŋ ‘fear’ is made causative, boʔtoŋ ‘scare’, by inserting the causative marker into it (Peterson, 2011: 231).

3.2.2.5 Non-Parallel Alignment

Units that belong together at higher levels can sometimes be merged at PL, causing a non-parallelism between the levels. This phenomenon is defined as Non-Parallel Alignment, as in the Dutch example (8) below, where ik and wou correspond to a single unit at PL and so do dat and hij.

8 [ik wou] [dat hij] kwam.
/kʋɑu dɑti kʋɑm/
I want.PST COMP he come.PST

‘I wish he would come.’

Non-Parallel Alignment

3.2.3 Fusion

Languages also show many-to-one correspondences, namely when two or more units on one level correspond to one single unit at another level. These correspondences are called Fusion. The Fusion features investigated in this research are the following: Cumulation of TAME and Case and Person and Case, Morphologically Conditioned Stem Alternation: Suppletion and Morphologically Conditioned Stem Alternation: Irregular Stem Formation.

3.2.3.1 Cumulation of TAME and Case and Person plus Case

With the word Cumulation, we refer to the expression of multiple meanings in a single grammatical unit. These are usually Affixes or Grammatical Words (Hengeveld, 2007), also known as ‘portmanteau morphs’ (Bauer, 2003). In fusional languages, like Italian, Cumulation is very common. In (9) below, for example, the morpheme -ò encodes tense, aspect, mood, person and number. In this case, we have Cumulation of TAME (Tense, Aspect, Mood and Evidentiality) and Person.

9 parler-ò
talk-IND.FUT.1.SG
‘I will talk’

Other semantic categories which are very commonly expressed by portmanteau morphs are gender, number, and, when present, case. An example of Cumulation of Case and Person is the genitive plural morpheme (for nouns) of the second declension in Latin, -orum. Both types of Cumulation are of course non-transparent.


3.2.3.2 Morphologically Conditioned Stem Alternation: Suppletion

Grammatical information is very often expressed through affixation, a process during which an affix is added to the stem. Another strategy is to make changes in the stem itself. One possible change is Suppletion, a morphological process during which the marking of specific information requires a stem which is not derivable from other stem forms of the same Lexeme (Bauer, 2003:48 & Hengeveld, 2007:39). An example of Suppletion is the paradigm of the English verb to go: go, went, gone.

3.2.3.3 Morphologically Conditioned Stem Alternation: Irregular Stem Formation

Grammatical information can also be expressed by Irregular Stem Formation, namely a modification in the stem. This type of alternation, however, is purely morphological and must be distinguished from the morphophonologically driven ones which will be discussed in 3.2.4 below. According to Bauer (2003), we can distinguish between four different kinds of Irregular Stem Formation. The first two are vowel and consonant mutations, as in the English paradigm begin-began-begun. The third kind is a modification of the segmental structure of the word, as in thief-thieve, where voicing determines whether the word refers to an Individual or an action (State-of-Affairs in FDG terminology). Finally, we also witness suprasegmental modifications, namely in the stress pattern, e.g. INsult (noun) and inSULT (verb). Nevertheless, only irregular modifications, i.e. those applying only to some stems, will be considered opaque. Regular ones are to be considered transparent.

3.2.4 Form-based Form

The last subgroup of opacity features groups together all zero-to-one correspondences between meaning and form. They are labelled Form-based Form because these formal units have no pragmatic or semantic counterparts. They are empty formal shells. The Form-based Form features investigated in this research are the following: Grammatical Gender, Syntactic Alignment, Nominal Expletives, Influence of Complexity on Word Order or Heavy Shift, Predominantly Head Marking, Morphophonologically Conditioned Stem Alternation, Morphologically Conditioned Affix Alternation and Conjugation/Declension.


3.2.4.1 Grammatical Gender

Languages tend to divide nouns into classes. The classification can be of two kinds: lexical or semantic. An example of the former, also called Grammatical Gender, is noun classification in Dutch. In this language, the selection of the neuter article het vs. the non-neuter article de is lexically driven and has no semantic motivation (Blom et al. 2008). Other languages, like Kikongo, have several noun classes (ten in the case of Kikongo) and every class only contains nouns that belong together semantically (Dereau, 1995). A language exhibiting Grammatical Gender, like Dutch, is to be considered opaque.

3.2.4.2 Syntactic Alignment

In a clause, arguments can be expressed in different ways. FDG recognises three types of alignment: pragmatic, semantic and morphosyntactic. The first, pragmatic or interpersonal alignment, is witnessed in Tagalog, where Topic arguments need to be marked by the particle ang= (10).

10 bumilí ang=lalake ng=isda sa=tindahan
PFV.A.buy SPEC.TOP=man OBL=fish LOC=store
‘The man bought fish at the/a store.’

Bickel (2011:8)

Interpersonal Alignment

The second type of alignment is semantic, or representational, alignment. Arguments can be either aligned based on a hierarchy of animacy and person, or based on their semantic role. In the latter case arguments are marked for categories such as Actor, Undergoer or Location (Hengeveld and Mackenzie, 2008). In Acehnese, for example, arguments are expressed through the use of clitics, depending on their semantic roles, e.g. =geuh for Undergoer:

11 gopnyan galak=geuh that
3.HON happy=3.HON.U very
‘He is very happy.’

Durie (1985: 56)

Representational Alignment

Finally, another group of languages presents purely morphosyntactic alignment, which is, as opposed to the previous ones, opaque because the arguments’ alignment has no counterpart at the IL and RL. This alignment leads to a zero-to-one relation between levels. English, for example, shows Grammatical Relations. (12) illustrates that the alignment of arguments is syntactically driven: he is the grammatical subject and therefore occurs in preverbal position.

12a He eats an apple.
12b He falls.
12c He was chased by the dog.
Syntactic Alignment

3.2.4.3 Nominal Expletives

Nominal Expletives are units needed at the ML to fill in the subject slot, without having any counterpart at IL and RL, and are therefore opaque. Also known as dummy subjects, they are mostly found in weather (13a) and existential (13b) predicates and non-raised constructions (13c).

13a It is snowing.

13b There is a dog in the garden.
13c It seems that John is tired.

All three verbs, to snow, to be and to seem, need a placeholder, either because the verb has a zero argument structure or because the subject, in this case a dog and John, does not precede the verb, as is usually the case, but follows it.

3.2.4.4 Influence of Complexity on Word Order or Heavy Shift

In FDG the order of constituents in the sentence is considered to be driven by their semantic and pragmatic status. This can, however, be overruled in some cases. If a certain constituent is morphosyntactically complex or heavy, it can be placed at the end of the sentence. The most common instances of Heavy Shift are shifts of NPs containing adnominal phrases or relative clauses, as in the English sentences in (14) below.

This is opaque, as the position of the girl with really long red hair is neither semantically nor pragmatically motivated.

3.2.4.5 Predominantly Head Marking

Grammatical information can be marked in two ways: by means of affixes, which are head marking, or by means of clitics or free-standing morphemes, which are phrase marking. As far as affixes are concerned, it is the class or complexity of the host which defines the nature of the affix, causing a zero-to-one correspondence between the RL and the ML. Thus they are to be considered opaque. Free-standing morphemes and clitics are, on the other hand, more transparent because they are not defined by the class or complexity of the host. For this feature it is impossible to assign a binary value; we therefore looked at which strategy is predominant.

3.2.4.6 Morphophonologically Conditioned Stem Alternation

Adding a morpheme to a stem might cause the latter to undergo certain changes. These changes may be due to semantic or pragmatic reasons, as we saw in 3.2.3.3 above, or to purely morphophonological reasons, resulting in a zero-to-one relation between RL and ML. In Hungarian, for instance, by adding the imperative morpheme -s the final -t of the stem of the verb köt- 'to tie' becomes -š: köš-s 'tie!'.

3.2.4.7 Morphophonologically Conditioned Affix Alternation or Conjugation/Declension

Just like stems, affixes can also undergo alternations when added to certain stems due to morphophonological reasons. This is a specific phenomenon and only applies to certain affixes. It is however important to distinguish between two kinds of Affix Alternation: a purely morphological and a morphophonological one. The latter is very rare, but we find an example of it in West Greenlandic. When the affix –lirtuuq ‘one who likes’ is attached to a stem, its first consonant phonologically adapts to the last one of the stem:

15 sin-nirtuuq


The morphological type of Affix Alternation is, on the other hand, lexically driven by conjugation or declension classes. With the term conjugation, we refer to the affix mutating based on the class of the verb it attaches to, declension being its nominal equivalent. An example of the latter comes from Latin, in which nouns are lexically divided into five classes and every class requires a specific affix paradigm, as illustrated in Table 2 below.

Table 2: The Latin declension for the second and third masculine noun classes

              2nd group       3rd group
              puer ‘boy’      rex ‘king’
Nominative    puer            rex
Genitive      puer-i          reg-is
Dative        puer-o          reg-i
Accusative    puer-um         reg-em
Vocative      puer            rex
Ablative      puer-o          reg-e

This alternation is clearly opaque since it is purely morpho(phono)logically driven and has no semantic motivation.

3.2.5 The BS Features

Besides the nineteen features from Leufkens (2013 & 2015) we decided to include in this study five more morphosyntactic features typical of the Balkan Sprachbund, namely Dependent Verb Forms, Analytic Formation of Future and Subjunctive, Grammaticalized Definiteness, Object Clitic Doubling and Poor Case Morphology. The next subchapter will present the features within the transparency framework.

3.2.5.1 Dependent Verbs

As stated by Friedman (2006, 2011) and Tomić (2006), the lack of a proper infinitive and other non-finite verb forms is to be considered the trademark of the Balkan Sprachbund and is claimed to be common to all languages listed in 2.2 above. Rather than speaking of finite vs non-finite verbs, we will speak of dependence. An independent verb form is one that can stand on its own in a main clause. A dependent one, on the other hand, cannot (Hengeveld, 1998). The latter are found in subordinate clauses (Dixon & Aikhenvald, 2006). The most common dependent verb forms are the following six: infinitive, participle, gerund, supine, subjunctive and nominalization (Hengeveld, 1998). In this subchapter, we will analyse what strategies our five languages use in order to express subordinate clauses of the three kinds: relative, complement and adverbial. Each language will be assigned one opacity point for every dependent verbal form it possesses. Analytical verb forms will be considered transparent. By analytical we mean all verbs that are formed by an auxiliary or free-standing morpheme followed by an independent verb, like the future formation in English. This strategy is transparent because one free-standing particle marks by itself a certain mood, e.g. subjunctive or infinitive.

3.2.5.2 Analytic Future and Subjunctive Formation

A second fundamental feature Balkanists have long claimed to be common to all languages of the Sprachbund is the analytic, rather than suffixal, subjunctive and future formation (Friedman 2006 and Tomić 2006). By suffixal we refer to the formation of a verb form by adding a specific morpheme to the verb root, as in the Italian future in (9) above. In contrast, the analytic formation implies the use of a free-standing marker or auxiliary followed by a finite verb form or the infinitive, as in (16) (Friedman, 2006).

16 I will go

The latter is, of course, more transparent, as one unbound item alone marks a specific mood or tense, as will does for future in English.

3.2.5.3 Grammaticalized Definiteness

Definiteness can be marked in several ways. One strategy is to use a determiner, marked for definiteness or indefiniteness (Lyons, 1999). This is highly transparent: one pragmatic feature corresponds to one single morphosyntactic unit. Other languages, however, do not have determiners, and must thereby resort to other opaque strategies in order to mark definiteness. Definiteness may, for instance, be expressed through constituent order, and thereby affect the one-to-one relation between a certain position and a certain syntactic or semantic function.


3.2.5.4 Clitic doubling

Object clitic doubling, also known as resumptive clitic pronouns, is a phenomenon by which an object clitic pronoun appears together with the noun phrase it refers to (Kallulli & Tasmowski, 2008), as in the Italian example (17) below.

17 Il prosciutto l’ho comprato io.
DEF.M.SG ham CL.ACC.M.SG’have.PRS.1.SG buy.PTCP.PST NOM.1.SG
‘I bought the ham.’ (lit. “I bought it the ham”)

Clitic doubling is, of course, opaque, as it gives rise to reduplication: two formal units refer to the same semantic unit.

3.2.5.5 Case Morphology

As claimed by Friedman (2006, 2011) and Tomić (2006) a.o., the languages of the Balkan Sprachbund have all simplified their inherited case morphology over time. The preference of a language to use prepositions over case morphology is considered transparent. Case is, on the contrary, considered opaque because it involves head marking rather than phrase marking, may lead to Cumulation (3.2.3.1 above) and, when a preposition is also present, to reduplication.

3.2.6 Summary of all the features

Before starting to present the methodology, we propose a summary of all the transparency features investigated in this study in Table 4 below.

Table 4: Summary of all transparency features

Feature                                                                   Transparent Value       Opaque Value

Redundancy
Clausal Agreement or Cross-Reference                                      Absent                  Present
Phrasal Agreement                                                         Absent                  Present
Tense Copying                                                             Absent                  Present

Discontinuity
Extraposition and Extraction                                              Absent                  Present
Raising                                                                   Absent                  Present
Circumfixes and Circumpositions                                           Absent                  Present
Infixes                                                                   Absent                  Present
Non-Parallel Alignment                                                    Absent                  Present

Fusion
Cumulation of TAME and Case                                               Absent                  Present
Morphologically Conditioned Stem Alternation: Suppletion                  Absent                  Present
Morphologically Conditioned Stem Alternation: Irregular Stem Formation    Absent                  Present

Form-based Form
Grammatical Gender                                                        Absent                  Present
Syntactic Alignment                                                       Absent                  Present
Nominal Expletives                                                        Absent                  Present
Influence of Complexity on Word Order or Heavy Shift                      Absent                  Present
Predominantly Head Marking                                                Mostly phrase marking   Mostly head marking
Morphophonologically Conditioned Stem Alternation                         Absent                  Present
Morphologically Conditioned Affix Alternation and Conjugation/Declension  Absent                  Present

BS Features
Dependent Verbs                                                           Absent                  Present
Future and Subjunctive Formation                                          Analytic                Suffixal
Grammaticalized Definiteness                                              Present                 Absent
Clitic Doubling                                                           Absent                  Present


3.3 Materials and Methods

For the present study, the research was conducted in two different ways. We made use of reference grammars, as well as native speaker consultants. Table 5 below lists the references and the names of the native speaker consultants.

Table 5: Reference grammars and native speaker consultants

Italian
Reference grammars: None
Native speaker consultants: Myself

Russian
Reference grammars:
- Timberlake, A. 2004. A Reference Grammar of Russian, Cambridge University Press
Native speaker consultants: Ekaterina Shilova, Olesia Iagovdik

Romanian
Reference grammars:
- Dindelegan, G. P. 2013. The Grammar of Romanian, Oxford University Press
- Dobrovie-Sorin, C. & I. Giurgea. 2013. A reference grammar of Romanian. Volume 1. The noun phrase, John Benjamins Publishing Company: Amsterdam
- Hoffman, C. N. 1989. Romanian Reference Grammar, Foreign Service Institute, U.S. Dept. of State, Washington D.C.
- Mallinson, G. 1986. Rumanian, Croom Helm: London
Native speaker consultants: Cristina Şiclovan, Dorin Perie

Bulgarian
Reference grammars:
- Scatton, E. A. 1984. A Reference Grammar of Modern Bulgarian, Slavica Publishers, Inc.
Native speaker consultants: Preslava Petrova, Magdalena Nedelova

Greek
Reference grammars:
- Holton, D., P. Mackridge & I. Philippaki-Warburton. 2004. Greek: An Essential Grammar of the Modern Language, Routledge
- Holton, D., P. Mackridge & I. Philippaki-Warburton. 2012. Greek: A Comprehensive Grammar, 2nd Edition, Routledge
Native speaker consultants: Anonymous on www.hinative.com, a platform dedicated to native speakers’ grammaticality judgments

The core method of this research has been to consult the aforementioned grammars and native speakers in order to attest the presence or absence of the opaque features listed in 3.2 above. For the aim of this study, we considered one example of opacity as sufficient evidence to define that feature as opaque (leading to a + value). For two of the BS features, Dependent Verbs and Case Morphology, we decided to also add in parentheses the number of cases and dependent verb forms each language shows, e.g. + (4). If a language was considered transparent with regard to a certain feature, then a negative value (-) was assigned. Note that we decided to count opaque rather than transparent features because of our decision to consider one instance as sufficient evidence for opacity. In case there was no literature available on a particular phenomenon, we assigned the value No Data (ND).
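The value assignment described above can be illustrated with a minimal Python sketch. It is not part of the original study: the function names (score_feature, format_value, tally) and the idea of passing in a count of attested opaque examples are assumptions made purely for illustration; only the three value labels (+, -, ND) and the optional count in parentheses come from the text.

```python
from collections import Counter

# Value labels as used in the text; the constant names are hypothetical.
OPAQUE, TRANSPARENT, NO_DATA = "+", "-", "ND"


def score_feature(opaque_examples, has_data=True):
    """Return '+', '-' or 'ND' for one feature in one language.

    opaque_examples: number of opaque instances attested in the grammars
        or by the consultants (a single instance is enough for '+').
    has_data: False when no literature or judgment was available.
    """
    if not has_data:
        return NO_DATA
    return OPAQUE if opaque_examples >= 1 else TRANSPARENT


def format_value(value, count=None):
    """Append the count reported for Dependent Verbs and Case Morphology."""
    if value == OPAQUE and count is not None:
        return f"{value} ({count})"
    return value


def tally(values):
    """Count how many features received each value."""
    return Counter(values)


if __name__ == "__main__":
    print(format_value(score_feature(4), count=4))   # a feature with a count
    print(score_feature(0))                          # transparent feature
    print(score_feature(0, has_data=False))          # no data available
```

Because a single attested instance already yields a + value, the sketch treats opacity as a threshold rather than a proportion, which mirrors the decision to count opaque rather than transparent features.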

4. The Results

The present section will outline the results, language by language. We would like to point out that, with regard to the transliteration of Russian, Bulgarian and Greek, we adopted the ALA-LC Romanization Tables from the Library of Congress.

4.1. The Italian Data

Out of the twenty-four features investigated in this study, Italian has nineteen. It is thus to be considered quite opaque. This chapter will outline the Italian data, divided by subgroups. A summary table will conclude the chapter.
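As a cross-check of this total, the per-feature values reported for Italian in subsections 4.1.1 to 4.1.5 can be tallied with a short Python sketch. The dictionary layout and the names ITALIAN and opaque_count are illustrative assumptions; the + and - values themselves are those given in the subgroup tables of this chapter, with the count attached to Dependent Verbs left out.

```python
# Values for Italian as reported in the subgroup tables of chapter 4.1.
# The data structure and names are hypothetical, for illustration only.
ITALIAN = {
    "Redundancy": {
        "Cross-Reference": "+",
        "Phrasal Agreement": "+",
        "Concord": "+",
        "Tense Copying": "+",
    },
    "Discontinuity": {
        "Extraposition and Extraction": "+",
        "Raising": "+",
        "Non-Parallel Alignment": "+",
        "Infixes": "-",
        "Circumfixes and Circumpositions": "-",
    },
    "Fusion": {
        "Cumulation of TAME": "+",
        "Suppletion": "+",
        "Irregular Stem Formation": "+",
    },
    "Form-based Form": {
        "Grammatical Gender": "+",
        "Syntactic Alignment": "+",
        "Heavy Shift": "+",
        "Predominantly Head Marking": "+",
        "Stem Alternation": "+",
        "Affix Alternation/Declension": "+",
        "Nominal Expletives": "-",
    },
    "BS Features": {
        "Dependent Verbs": "+",  # reported as "+ (4)" in the text
        "Definiteness": "-",
        "Analytic Future and Subjunctive Formation": "+",
        "Clitic Doubling": "+",
        "Case Morphology": "-",
    },
}


def opaque_count(features):
    """Number of features in one subgroup that carry the value '+'."""
    return sum(1 for value in features.values() if value == "+")


totals = {group: opaque_count(features) for group, features in ITALIAN.items()}
n_features = sum(len(features) for features in ITALIAN.values())
n_opaque = sum(totals.values())

for group, total in totals.items():
    print(f"{group}: {total} of {len(ITALIAN[group])} opaque")
print(f"Total: {n_opaque} of {n_features} opaque")
```

The subgroup totals (4, 3, 3, 6 and 3) add up to the nineteen opaque features out of twenty-four stated above.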

4.1.1. Redundancy

The subcategory Redundancy groups together four features: Cross Reference or Clausal Agreement, Phrasal Agreement, Concord and Tense Copying. Italian is quite opaque in these respects and has all four.

Cross Reference      +
Phrasal Agreement    +
Concord              +
Tense Copying        +

In Italian the predicate always agrees with its subject in person and number. The predicate can stand on its own, which implies that the subject is simply a spelled-out replication of the person and number features already present on the verb. The subject is only overtly pronounced when it has a contrastive focal or topical function. As (18) below shows, the agreement relation in Italian is therefore Cross-Reference:

18 Veng-o domani.
come-PRS.1sg tomorrow
‘I am coming tomorrow.’

Io veng-o domani, tu sabato.
1sg.NOM come-PRS.1sg tomorrow, 2sg.NOM Saturday
‘I am coming tomorrow, you are coming on Saturday.’

Agreement is obligatory on the phrasal level, where modifiers, determiners and demonstratives must all agree in gender and number with the noun:

19 Le vecchi-e signor-e
DET.F.PL old.F.PL lady.F.PL
‘The old ladies.’

Both Concord and Tense Copying are present in Italian. (20) shows all types of concord found in this language.

20a Plural Concord
Due libr-i
Two book-M.PL

20b Negative Concord
Non ho vist-o nessuno.
NEG have.1sg.PRS see.PRT.PST-M nobody
‘I haven’t seen anyone.’

20c Temporal Concord
Ieri ho studiato
Yesterday have.1sg.PRS study.PRT.PST-M
‘Yesterday I studied.’

Tense Copying is only obligatory in certain contexts. The tense of the subordinate clause relates to the external time of speech. In (21a) below Maria was studying Latin at the same time as the internal speech, which is in the past (ha detto ‘she said’); therefore the past tense of the main clause is copied to the embedded clause. In (21b) the tense feature is not copied to the subordinate clause. The sentence acquires a different meaning and the embedded clause acquires an imperfective aspect: Maria said she studies Latin (today).

21a Maria ha dett-o che studi-av-a Latino.
Maria have.3sg.PRS say.PRT.PST-M COMP study-IMP-3sg Latin
‘Mary said she was studying Latin.’
Tense Copying

21b Maria ha dett-o che studi-a Latino.
Maria have.3sg.PRS say.PRT.PST-M COMP study.PRES-3sg Latin
‘Mary said she studies Latin.’
No Tense Copying

4.1.2. Discontinuity

Of the five Discontinuity features investigated Italian only presents three: Extraposition and Extraction, Raising and Non-Parallel Alignment.

Extraposition and Extraction       +
Raising                            +
Non-Parallel Alignment             +
Infixes                            -
Circumfixes, Circumpositions       -

Extraposition and Extraction are very common in Italian. They are especially found in sentences with a specific pragmatic meaning, such as focus and topic. (22a) is unmarked, while in its marked counterpart (22b) the adjunct “di Clint Eastwood” is extraposed.

22a Abbiamo molti film di Clint Eastwood a casa.
Have.PRS.1PL many.M.PL film by Clint Eastwood at home
Non extraposed

22b Abbiamo molti film a casa di Clint Eastwood.
Extraposed
‘We have many movies by Clint Eastwood at home.’

Raising is present in Italian but only witnessed with two verbs, namely sembrare ‘seem’ and parere ‘seem, appear’.

23a Sembr-a che Mario sia stanc-o.
seem.PRS-3SG COMP Mario be.SUBJ.PRS.3SG tired-M.SG
‘It seems that Mario is tired.’
Unraised

23b Mario sembr-a stanc-o.
Mario seem.PRS-3SG tired-M.SG
‘Mario seems tired.’
Raised

The last feature of this subgroup that we find in Italian is Non-Parallel Alignment, which is found in sentences containing phonologically weak elements such as clitics. In (24) the dative clitic gli ‘to him’ is spelled out together with the auxiliary ho ‘have’.

24 Gli ho aperto.
3SG.DAT have.PRS-1SG open.PST.PRT.SG
ˈʎɔ aˈperto
‘I opened the door for him.’

Infixes, Circumfixes and Circumpositions are, on the other hand, not present in this language.

4.1.3. Fusion

All three Fusion features investigated are present in Italian: Cumulation of TAME plus Person, Morphologically Conditioned Stem Alternation: Suppletion and Irregular Stem Formation.

Cumulation of TAME          +
Suppletion                  +
Irregular Stem Formation    +

Even though case is not marked on Italian verbs, there is always Cumulation of TAME. As example (25) below clearly shows, the portmanteau morpheme -o encodes person, number, aspect and tense.

25 Legg-o
read-IND.PRS.PFV.1SG

As far as Suppletion is concerned, it is found in the paradigm of some irregular verbs such as andare ‘to go’ (26a) and essere ‘to be’ (26b), where the verbal stem changes throughout the conjugation:

26a vad-o           andav-o          andr-ò
    ‘I am going’    ‘I was going’    ‘I will go’

26b sono            ero              sarò
    ‘I am’          ‘I was’          ‘I will be’

Another instance of Suppletion is found in the declension of personal pronouns, which have three different forms: nominative, dative and accusative:

Table 6: The declension of Italian pronouns

       Nominative                    Dative     Accusative
1SG    io                            mi, me     mi, me
2SG    tu                            ti, te     ti, te
3SG    egli, essa, esso, lui, lei    gli, le    la, lo
1PL    noi                           ci         ci
2PL    voi                           vi         vi
3PL    essi, esse                    loro       li, le

Irregular Stem Formation is found in the conjugation of certain verbs, which show an alternation of the stem vowel in the perfect tense. This behaviour is peculiar to some verbs only and is not characteristic of a specific verbal class. Table 7 below shows the perfect tense conjugation of fare ‘to do’ and vedere ‘to see’. The thematic vowels a and e are substituted by e and i respectively in the 1st and 3rd person singular and the 3rd person plural.

Table 7: Perfect tense conjugation of fare and vedere

       fare ‘to do’    vedere ‘to see’
1SG    fec-i           vid-i
2SG    fac-esti        ved-esti
3SG    fec-e           vid-e
1PL    fac-emmo        ved-emmo
2PL    fac-este        ved-este
3PL    fec-ero         vid-ero

4.1.4. Form-based Form

This subcategory includes all forms that do not have a higher counterpart at IL and RL. Italian is quite opaque in this respect. Out of the seven features investigated, it has six: Grammatical Gender, Syntactic Alignment, Heavy Shift, Predominantly Head Marking, Morphophonologically Conditioned Stem Alternation and Affix Alternation and Declension.

Grammatical Gender              +
Syntactic Alignment             +
Heavy Shift                     +
Predominantly Head Marking      +
Stem Alternation                +
Affix Alternation/Declension    +
Nominal Expletives              -

Nouns in Italian can have two genders: masculine and feminine. In certain cases, such as for jobs and pets, gender is semantically assigned, but as far as objects and most animals are concerned, gender is assigned arbitrarily:

27 tavol-o       sedi-a        ippopotam-o    tigr-e
   table-M.SG    chair-F.SG    hippo-M.SG     tiger-F.SG

Italian has Syntactic Alignment. The position of the arguments in the clause is purely driven by their syntactic functions. In (28a) below Mario is the Actor and the subject. On the contrary, in its passive counterpart (28b) Mario is the Actor but not the subject. The subject is instead the Undergoer of the action, la mela ‘the apple’.

28a Mario mangi-a la mela.
Mario eat-IND.PRS.3SG DET.F.SG apple
‘Mario eats the apple.’
Active

28b La mela viene mangi-at-a da Mario.
DET.F.SG apple come-IND.PRS.3SG eat-PST.PRT-F.SG by Mario
‘The apple is eaten by Mario.’
Passive

Complexity influences word order in Italian. When particularly heavy, constituents can be moved to the end of the clause. This often happens to arguments with heavy specifiers (29b) or followed by a relative clause (29c):

29a Ieri ho visto la tua amic-a al parco.
yesterday have.PRS.1SG see.PST.PRT DET.F.SG POSS.F.2SG friend at park

29b Ieri ho visto al parco la tua amica con quel gross-o cane ner-o.
with DEM.M.SG big-M.SG dog black-M.SG

29c Ieri ho visto al parco la tua amica che studi-a economia con Giuseppe.
who study-PRS.3SG economy with Giuseppe

‘Yesterday I saw your friend <with that big black dog/who studies Economy with Giuseppe> in the park.’

In Italian only definiteness and possession are marked through free-standing morphemes on the phrase level. All other grammatical information is marked through the use of affixes, which are always Head Marking, as (30) below shows for the plural morpheme -e.

30 Quell-e vecchi-e e simpatich-e signor-e
DEM-F.PL old-F.PL and nice-F.PL lady-F.PL
‘Those old and nice ladies.’

Morphophonologically Conditioned Stem Alternation is quite rare in this Romance language. Within the nominal domain it can be found in the formation of plural nouns. When adding the plural morpheme -i to the stem of amic- ‘friend’, the last phoneme of the stem undergoes a phonological change: it becomes a palato-alveolar affricate.

31 amic-o /amiko/
   amic-i /amitʃi/

Morphologically Conditioned Affix Alternation is only found in the verbal domain in Italian, namely in the verbal Conjugation. As Table 8 below shows, most verbal affixes behave in a regular way. The third person singular morpheme is nevertheless still conjugation-dependent.

Table 8: Conjugation of amare, credere and dormire

       1st group          2nd group               3rd group
       amare ‘to love’    credere ‘to believe’    dormire ‘to sleep’
1SG    am-o               cred-o                  dorm-o
2SG    am-i               cred-i                  dorm-i
3SG    am-a               cred-e                  dorm-e
1PL    am-iamo            cred-iamo               dorm-iamo
2PL    am-a-te            cred-e-te               dorm-i-te
3PL    am-ano             cred-ono                dorm-ono

Finally, Nominal Expletives are not present in Italian.

4.1.5 BS Features

In this last subchapter we will outline the results for Italian with regards to the BS features. These features are: Dependent Verb Forms, Definiteness, Analytic Future and Subjunctive Formation, Clitic Doubling and Case Morphology. Of these five features, Italian is only opaque regarding three of them: Dependent Verbs, Clitic Doubling and Analytic Future and Subjunctive Formation.

Dependent Verbs                              + (4)
Definiteness                                 -
Analytic Future and Subjunctive Formation    +
Clitic Doubling                              +
Case Morphology                              -

With regard to Dependent Verbs, Italian is very opaque. Of the six dependent verbal forms we investigated, it has four: infinitive, gerund, participles and subjunctive.

The infinitive is the most common of these forms and is used in both complement and adverbial clauses. In complement clauses where the subjects of the main and subordinate clauses are identical, the infinitive is almost always obligatory:

32 Vogli-o andare

want-PRS.1.SG go.INF

*Voglio che vado

COMP go.PRS.1.SG

I want to go.

The gerund is only used in adverbial clauses:

33 Spingendo una sedia arriv-ò alla porta
push.GER DET.F.SG chair arrive-PST.3.SG to.DET.F.SG door
‘By pushing a chair s/he got to the door.’
Nedjalkov (1998:450)

Present participles are rarely used in spoken Italian in adverbial clauses, but they remain in certain idiomatic expressions (34a). In relative clauses they are slightly more common (34b).

34a Vole-nte o nole-nte, devi uscire.

want-PTCP.PRS or not.want-PTCP.PRS must-PRS.2.SG go.out

‘Wanting it or not, you must go out.’
Haspelmath and König (1998:602)

34b Quest-a è una pianta provenie-nte

DEM-F.SG be.PRS.3.SG INDEF.F.SG plant come-PTCP.PRS

dall’ Africa

from.DEF Africa

This is a plant that comes from Africa.

Past participles are conversely very common in both adverbial (35a) and relative clauses (35b).

35a Finita la scuola, vado a Parigi.
finish.PTCP.PST DET.F.SG school, go-PRS.1.SG to Paris
‘Once I finish school, I will go to Paris.’

35b Ved-o la macchina della ragazza ucci-sa.
see-PRS.1.SG DEF.F.SG car of.DEF.F.SG girl murder-PTCP.PST.F.SG
‘I see the car of the girl who was murdered.’

Finally, the subjunctive is found in both complement (36a) and adverbial clauses. In some cases, the use of the subjunctive is obligatory (36b).

36a Pens-o che sia bello.

think-PRS.1.SG COMP be.SUBJ.PRS.3.SG beautiful

I think it’s beautiful.

36b Nonostante sia/*è freddo, …
despite be.SUBJ.3.SG/be.PRS.3.SG cold
‘Despite it being cold, …’

Italian has a very transparent strategy to mark definiteness. It has two sets of free-standing determiners, one marking definiteness (37a) and one indefiniteness (37b).

37a Ho visto il cane.

have.PRS.1.SG see.PTCP.PST DEF.M.SG dog

37b Ho visto un cane.

have.PRS.1.SG see.PTCP.PST INDEF.M.SG dog

I have seen the/a dog.

Being a fusional language, Italian is highly opaque with regard to the Future and Subjunctive formation. Both future (38a) and subjunctive (38b) are formed in a non-analytical way.

38a Ci andr-ò domani.
CL.LOC go.FUT-FUT.1.SG tomorrow
‘I will go tomorrow.’

38b Che vad-a a casa!

that go.SUBJ-SUBJ.3.SG to home

Let it go home!

In Italian resumptive clitic pronouns are very common, but confined to the spoken variety.

39 Il prosciutto l’ho comprato io.
DEF.M.SG ham CL.ACC.M.SG’have.PRS.1.SG buy.PTCP.PST NOM.1.SG
‘I bought the ham.’