
Development and variation of non-V2 order in Norwegian wh-questions


Development and variation of non-V2 order in Norwegian wh-questions

Supervisor:

prof. dr. A.P. (Arjen) Versloot

Amsterdam Center for Language and Communication

Second reader:

prof. dr. F.P. (Fred) Weerman

Amsterdam Center for Language and Communication

Number of words:

17.139

Maud Westendorp

10678646

Thesis (30 ECTS)

Research Master Linguistics

University of Amsterdam

January 2017


Development and variation of non-V2 order in Norwegian wh-questions

Maud Westendorp

Abstract

Across Norwegian dialects, wh-questions show variation with respect to word order possibilities, with many dialects allowing non-V2 word order. The acceptance of non-V2 orders differs considerably across dialects and further depends on the complexity and function of the wh-element. Many studies have attempted to explain this variation and the development of non-V2 wh-questions (a.o. Nordgård 1988; Lie 1992; Vangsnes 2005; Westergaard et al. 2012, 2017). In this study, both synchronic data from the Nordic Syntax Database and historical data are examined, and it is hypothesised that the synchronic variation between the dialects mirrors the diachronic development from V2 to non-V2. Additionally, I hypothesise that non-V2 word order first developed in Central and Northern Norwegian in subject wh-questions with the complementizer som in the verb-second position and later spread to non-subject as well as complex wh-questions. An apparent-time study of the synchronic data shows a diachronic connection between some but not all of the synchronic varieties, but no evidence was found to support the hypothesis that non-V2 first emerged in subject wh-questions. On the basis of historical sources, it is shown that non-V2 likely developed at the end of the 19th century in Central and Northern Norwegian. Furthermore, the emergence of non-V2 order is linked to the loss of the present tense marker on the finite verb, which allowed other elements to fill the verb-second position and gain the ability to lexicalise this position, thus removing the trigger for V2 and resulting in the emergence of non-V2 word order. Combining the findings from the synchronic and the historical data, it is made clear that non-V2 word order started in simplex wh-questions, which are shown to be most frequent, and subsequently spread to other, less frequent, types of questions.


Table of contents

1 Introduction
2 Background
2.1 Verb-second and verb-second violations in Norwegian
2.2 Word order variation in Norwegian dialects
2.2.1 Synchronic variation within dialects
2.2.2 Development of non-V2 word order
2.2.3 Synchronic variation and diachronic change connected
2.3 Hypotheses and predictions
3 Methods
3.1 Critical assumptions and prerequisites of methodology
3.1.1 Connection between synchronic variation and diachronic change
3.1.2 Combining formal-theoretical and quantitative-statistical linguistics
3.2 Examining synchronic distribution
3.2.1 Data collection
3.2.2 Coding NSD-data
3.2.3 Data analysis
3.3 Studying diachronic development
3.3.1 Data collection
3.3.2 Data analysis
4 Results
4.1 Presentation and interpretation of the quantitative data
4.1.1 Nordic Syntax Database
4.1.2 Mapping aggregate variation
4.1.3 Nordic Dialect Corpus
4.2 Presentation and interpretation of the qualitative data
4.2.1 Findings in historical sources
4.2.2 Connection with loss of final /r/
5 Discussion


1 Introduction

This study investigates variation across Norwegian dialects with respect to the word order possibilities in wh-questions by examining the synchronic distribution as well as the diachronic development of this phenomenon.

Standard Norwegian is a verb-second (V2) language and has a V2-requirement in all main clauses. In contrast with Standard Norwegian, many Norwegian dialects however lack verb-second order in wh-interrogatives. An example of an interrogative with non-V2 word order from the Nordic Dialect Corpus (NDC) (Johannessen et al. 2009, 2010) is given in (1).

1. Ka du mein me å karrakteriser språk-e? (stamsund_04gk)

what you mean with to characterise language-DEF

‘What do you mean with characterizing language?’

Many earlier studies examining the order variation in these wh-questions focus exclusively on the synchronic variation within a single dialect or region. Recently, however, Westergaard et al. (2017)¹ have proposed a diachronic account of the distribution across Norway on the basis of acceptability judgements from the Nordic Syntax Database (Lindstad et al. 2009). In my opinion, these data can be examined in more depth and further integrated with diachronic material. I hope to contribute to the ongoing debate on non-V2 word order by analysing and combining both historical and synchronic material. On the basis of synchronic dialect data and historical sources I show that non-V2 word order in Norwegian wh-questions developed in the late 19th century in Central and Northern Norwegian. I suggest that the loss of the present tense marker -r is connected to the reanalysis of different items occupying the verb-second position, resulting in the development of non-V2 word order. Non-V2 word order starts in simplex wh-questions and subsequently spreads to complex wh-questions, which are shown to be much less frequent.

The outline of the paper is as follows. In the next section, I provide some background on word order variation in wh-questions in Norwegian, focusing on the factors guiding this variation. In addition, this section covers the hypotheses and predictions for the present research. Section 3 discusses important theoretical and methodological notions concerning the relation between synchronic variation and diachronic change and the application of quantitative-statistical methods in linguistic research, before outlining the method of the study. Section 4 gives a presentation and interpretation of the data. I will unify and discuss the impressions from the synchronic and diachronic data in the final section.

¹ This publication is forthcoming. Page numbers refer to the pre-print version received from Øystein A. Vangsnes.


2 Background

2.1 Verb-second and verb-second violations in Norwegian

The term verb-second (V2) is used to describe the set of rules underlying the obligatory movement of the finite verb to the second position, either specifically in main clauses or in all finite clauses (Holmberg 2015). All modern Germanic languages, with the noteworthy exception of English, are V2. It is unclear when this rule became obligatory in Germanic; unfortunately, the introduction of V2 took place in a period “which has left us very little material on which to base hypotheses about its origin” (Faarlund 2010:207). It is theorised that the rule emerged when movement of auxiliaries to the second position was reanalysed from being just a phonological rule of cliticization to being a syntactic movement rule. This movement was subsequently generalised to all finite verbs (Faarlund 2010:208). English lost the verb-second rule during the Middle English period (early 15th c.), presumably due to a change in the inflectional morphology (see e.g. Haeberli 2002, Faarlund 2010).

Norwegian is a Germanic language spoken by approximately 5 million speakers. Outside of Norway the language is predominantly spoken in Denmark, Finland and Sweden (Lewis 2009). Norway has an unusual language situation in which two officially recognised literary varieties exist. The syntactic patterning of both varieties, called Bokmål (‘book language’) and Nynorsk (‘new Norwegian’), is to a large extent the same. The former variety is based on the written Danish standard that supplanted the traditional Norwegian from the 16th century onwards; Nynorsk is a variety created by Ivar Aasen (1813-1896) on the basis of spoken dialects that are derived from Old Norse (Askedal 1994:219-21). In this thesis, the term ‘Standard Norwegian’ is used to indicate the standard written variety of Bokmål as well as Nynorsk, which both have a strict V2-requirement. All Norwegian dialects are mutually intelligible and can be divided into four main groups: East Norwegian (with Oslo as a regional capital), West Norwegian (Bergen/Stavanger), Central Norwegian (Trondheim) and Northern Norwegian (Tromsø) (Mæhlum & Røyneland 2012:25) (see Figure 2.1). The spoken language of the capital Oslo is often considered the most prestigious standard, but regional standards also exist (e.g. Bergen, Trondheim) (Askedal 1994; Sandøy 1998).


The predecessor of the Modern Scandinavian languages, Old Norse, had a clear V2 requirement in both matrix and subordinate clauses (Faarlund 2004:191). All modern Scandinavian varieties but Modern Icelandic have lost this requirement in subordinate clauses (Faarlund 2010:209). Norwegian is generally considered to be a restricted verb-second language (Wiklund et al. 2007). In a ‘restricted’ verb-second language, V2 is required in main but not subordinate clauses. Remarkably, many dialectal varieties of Norwegian allow non-V2 word order alongside V2 word order in matrix wh-questions. Two examples of Norwegian wh-questions with non-V2 order are given in (2). These sentences are produced by two different dialect speakers from Myre, Nordland in Northern Norway (2a) and Gausdal, Oppland in Central Norway (2b). In (2a) the wh-element korr ‘how’ is a non-subject; in (2b) the wh-element åkkje ‘who’ functions as a subject. In the Standard Norwegian version of these examples (3) the verb has to be placed in the V2-position. Note that the complementizer som ‘that’ therefore is not present in (3b) below. As shown by these examples, non-V2 word order is possible both with complex or long wh-elements such as hvordan ‘how’ or hvor mye ‘how much’ and with simplex or short wh’s.

2. a. Korr de går me denn ær mottosjporrtklubben? (myre_01um)

how it goes with that there motorclub

‘How is (it with) that motorclub?’

b. Åkkje såmm driv me di ra? (gausdal_05um)

who COMP works with that then

‘Who is dealing with that?’

3. a. Hvordan går det med den der motorsportklubben? (St. Norwegian)

how goes it with that there motorclub

b. Hvem driver med det da? (St. Norwegian)

who works with that then

Deviations from V2 are also possible in yes/no questions (see (4) below). In some Norwegian dialects in the county Rogaland, embedded dependent questions and matrix yes/no-questions such as (4a) may be introduced by om ‘if, whether’ (see (4b)). This construction has been described by Enger (1995), Lie (1992) and Vangsnes (1996) and is considered a syntactic innovation, as it is never mentioned in dialect descriptions written at the start of the 20th century. Furthermore, Opsahl (2010) and Freywald et al. (2015) have shown that deviations from V2 occur frequently in main declarative clauses in urban contemporary vernaculars such as the multiethnolect spoken by adolescents in multi-ethnic areas of Oslo. These multiethnolectal varieties lie outside the scope of the current study, which will focus only on matrix wh-questions.


4. a. Kann æ svare de? (botnhamn_03)2

can I answer that

‘Can I answer that?’

b. Om eg kan få ei kaga? (Lie 1992:68)

if I can get a cake

‘Can I get a cake?’

As mentioned above, Norwegian is a restricted V2 language with obligatory V2-movement only in main clauses. The non-V2 order that is visible in the wh-questions in (2) is therefore equivalent to the order of embedded sentences (also embedded wh-questions) in Norwegian. The complementizer som that also appears in subject wh-questions with non-V2 order is often used to introduce such relative and embedded wh-clauses. The distribution of som displays a subject/object asymmetry; som is obligatorily present with subject wh extractions (5a), but ungrammatical in object wh extractions (5b) (Franco & Boef 2015). In contrast with the ungrammaticality of som in embedded object wh-questions, som is optional with object extractions in relative clauses in Norwegian (5c).

5. a. Hun spurte [hvem *(som) kom].

she asked who COMP came

‘She asked who came.’

b. Hun spurte [hvem (*som) Maria ringte].

she asked who COMP Maria called

‘She asked who Maria called.’

c. Hun hater mann-en (som) Maria ringte.

she hates man-DEF COMP Maria called

‘She hates the man that Maria called.’

The word order pattern in non-V2 subject wh-questions (e.g. (2b)) is the same as the word order pattern in embedded subject questions (5a). The non-V2 word order with som in subject wh-questions therefore is no autonomous order but in fact mirrors the order of the embedded question. The fact that som is obligatory in the context of subject extraction is vital to the distinction between V2 and non-V2 orders in Norwegian, since subject questions otherwise display no difference between V2 and non-V2 order in an SVO language. Also, as som in the embedded object wh-question in (5b) is ungrammatical, this word order patterns with the non-V2 object wh-question in (2a). In that sense, these sentences also have the same underlying pattern. The examples in (5) all have short wh-elements, but the same patterning holds for (embedded) questions with long wh-elements (p.c. M. Berg-Leirvåg; K. Aslaksen, January 24-26, 2017). The word order observed in non-V2 interrogatives also mirrors word order in focus constructions or clefts. In spoken Norwegian, this construction is extremely common (Faarlund et al. 1997:1091); subject as well as non-subject wh’s and both simplex and complex questions can be used in cleft constructions:

6. a. Hvem var det som ringte?

who was it that called

‘Who was calling?’

b. Hvorfor er det du gjør det?

why is it you do that

‘Why are you doing that?’

It is important to note that there is no verb movement to the Left Periphery in non-V2 wh-questions. This becomes clear when we consider the position of the verb in relation to sentence adverbs, as in these (slightly modified) examples from Westergaard et al. (2017:4):

7. a. Kem du (*skal) ikkje skal møte i bar-en?

who you shall not shall meet in bar-DEF

‘Who will you not meet in the bar?’

b. Kem som (*har) ikkje har vært i bar-en?

who that has not has been in bar-DEF

‘Who has not been in the bar?’

As can be seen in (7) above, the finite verb appears to the right of the sentence negation ikkje ‘not’. In the Standard Norwegian versions of these examples (see (8)), the finite verb is positioned to the left of the negative element such that it occupies the V2-position. The lack of verb movement in (7) further confirms that the word order in these non-V2 questions is the same as the word order in embedded sentences.

8. a. Hvem skal du ikke (*skal) møte i bar-en? (St. Norwegian)

who shall you not shall meet in bar-DEF

b. Hvem har ikke (*har) vært i bar-en? (St. Norwegian)

who has not has been in bar-DEF

Summarizing over all types of matrix wh-questions exemplified in this section, there are four types of questions allowing non-V2 word order in Norwegian: simplex and complex wh-questions with either subject or non-subject wh-elements. The current study will revolve around these four types of questions; examples of each type are given in Table 2.1.

Table 2.1 Types of questions allowing non-V2 order (from Nordic Dialect Corpus)

Question type: short subject wh (source: gausdal_05um)
Åkkje såmm driv me di ra?
who COMP work.PRS with that then
‘Who is dealing with that?’

Question type: long subject wh (source: heroeyMR_01um)
Hvor mye kollektivtrafikk som er til Kvalsvika om somrene?
how much public transport COMP be.PRS to Kvalsvika in summer.PL
‘How much public transport is there to Kvalsvika in the summer?’

Question type: short non-subject wh (source: ballangen_02uk)
Ka du ha jorrt på skola i dag?
what 2SG have.PRS done.INF at school today
‘What did you do at school today?’

Question type: long non-subject wh (source: kirkesdalen_01um)
Koss'n e kåmm mæ dit?
how 1SG come.PST REFL here
‘How did I get here?’

2.2 Word order variation in Norwegian dialects

2.2.1 Synchronic variation within dialects

The non-V2 phenomenon in Norwegian wh-questions has been described in both dialectological and theoretical literature to a notable extent. In this section, descriptions and analyses of synchronic variation will be discussed before moving to a discussion of the development of this construction.

First of all, data from the Nordic Dialect Corpus and Syntax Database (Johannessen et al. 2010) show that the non-V2 word order is accepted in large parts of Norway (see Figure 2.1). However, dialects differ significantly from each other with respect to which types of questions allow non-V2 word order. Both the syntactic properties and the function of the wh-element seem to play an important role in determining the attested variation. Non-V2 wh-questions with simplex wh-phrases (ka ‘what’, kor ‘where’, kem ‘who’) that function as subject (with obligatory som in the V2 position) are most widespread (Figure 2.2; see also (2b) for an example). Non-V2 order in other types of wh-questions, where the wh-element functions as a non-subject (distribution in Figure 2.3), or in questions that start with a complex wh (e.g. korfor ‘why’, korsen ‘how’), is not as frequently accepted.

Figure 2.2 Acceptability judgements of simplex, subject wh-question with non-V2 word order (NDS #17) (black = low score, white = high score).

Figure 2.3 Acceptability judgements of

In a study of the wh-questions available in the Nordic Dialect Corpus, Vangsnes & Westergaard (2014) demonstrated that 40.5% of the produced wh-questions had non-V2 word order. Importantly, of the 540 non-V2 wh-questions found, only 40 (8%) started with a complex wh-element. Overall, simplex wh-questions are almost three times as frequent as complex wh-questions (2014:142).

Previous studies have shown that dialects allowing non-V2 word order also all allow wh-questions with V2 word order (e.g. Vangsnes et al. 2016). Interestingly, it is generally assumed that the two orders do not differ systematically from each other in semantics (Westergaard 2003:82). Pragmatic factors do play a role in the variation, however (Lie 1992:73, Westergaard 2003, Vangsnes 2005:198). Most research on this topic has focussed on word order variation within specific Norwegian dialects, singling out different linguistic and sociolinguistic variables that influence the choice between the V2 and non-V2 order. These factors include inter alia the type of question word (monosyllabic vs. multisyllabic) (Åfarli 1986), the information status of the subject (i.e. new or given information; Westergaard 2003), the choice of verb and form of the subject (i.e. full DP in V2-constructions v. (personal) pronoun with non-V2; Westergaard & Vangsnes 2005) and the possibility of insertion of the complementizer som ‘that’ under embedded subject extraction (Westergaard et al. 2012).

A large part of the research that has been done on the syntactic phenomena underlying non-V2 word order in Norwegian wh-questions focuses on specific dialects. The majority of these studies focus on the dialect of Tromsø and other Northern Norwegian varieties. These accounts can be roughly divided into two groups: in earlier accounts the length of the wh-element is argued to be the main factor responsible for variation between V2 and non-V2, whilst more recent accounts focus on the central role of the functional item som ‘that’. In the Tromsø dialect, non-V2 order is accepted in simplex wh-questions (alongside V2 order) but not in questions with complex wh’s. This difference in word order possibilities with simplex v. complex wh’s has been analysed in different ways. Westergaard & Vangsnes (2005) and Westergaard (2009) suggest that short or simplex wh-elements can be analysed as heads that move to the matrix V2-position (C⁰) that the verb would normally take. Since the V2-requirement that C⁰ must be filled (Den Besten 1983) is then already satisfied, V2 movement is no longer triggered and non-V2 order appears. Long or complex wh’s are considered to be phrases and cannot move to this C⁰-position directly, explaining the lack of non-V2 order with these wh’s according to Westergaard & Vangsnes (2005). Westendorp (2014) provides an explicit structural account of the movement of the wh-phrases and the differing word order possibilities for different types of wh-questions in this dialect. Building on accounts of clitic movement in Romance languages (e.g. Sportiche 1996), Westendorp shows that complex wh’s move like phrases in Tromsø Norwegian, whilst simplex wh’s can move (long-distance) as phrases as well as heads, analogous to the movement of Romance clitics. This explains the resulting optionality of V2 for this construction.


As an alternative analysis, Westendorp (2014:33) provides evidence for the possibility that Northern Norwegian simplex wh’s undergo phrasal movement and merge into C⁰, again removing the trigger for the finite verb to move to this position. The option for simplex wh’s to be either analysed as heads and merged into C⁰ or to stay in SpecCP is linked to the dual nature of these wh’s, which have properties of both phrases and heads. In contrast with the Tromsø dialect, other dialects, such as the variety of Norwegian spoken in Nordmøre (Åfarli 1986) and some Northwestern dialects (Vangsnes 2005), do not distinguish between simplex and complex wh-elements. As dialects were found that not only had an asymmetry between simplex and complex wh’s but also showed subject/non-subject asymmetries with respect to V2 v. non-V2 order possibilities, the head v. phrase analysis had to be revised. In more recent accounts, the head v. phrase distinction has therefore been abandoned in favour of analyses where the functional element som ‘that’ plays a central role. In these analyses, the element som, originally a complementizer introducing embedded clauses, is analysed as a head that can fill C⁰ and block V2-movement. Important for this analysis is that dialects differ with respect to treating som as a specifier or as a head (Vangsnes 2005:212). Nordgård first suggested a connection between the possibility of non-V2 word order in wh-interrogatives in a dialect and the status of the complementizer som in 1985. He found that dialects that have non-V2 in matrix wh-questions often also allow insertion of the complementizer som under the extraction of an embedded subject (1985:35, see also Nordgård 1988). In Standard Norwegian, no complementizer is present to introduce an embedded question from which a wh-constituent is extracted (9a). However, in certain dialects, som can be inserted as a complementizer in these contexts (9b).

9. a. Hvem tror du har gjort det?

who think you has done it

b. Hvem tror du som har gjort det?

who think you COMP has done it

‘Who do you think has done it?’

Vangsnes (2005) and Westergaard et al. (2012, 2017) adopt Nordgård’s condition in their analyses of the synchronic V2/non-V2 variation, at first alongside the head v. phrase analysis; later studies relate the rise of non-V2 solely to changes in the properties of the complementizer som (Westergaard et al. 2017). Research focussing on the development of non-V2 word order will be discussed in the next section. Summarizing, the acceptance of non-V2 orders in wh-questions differs substantially across dialects and further depends on the type and function of the wh-element.


2.2.2 Development of non-V2 word order

In historical grammars of Norwegian (dialects), many have speculated on the origins of wh-questions with non-V2 order. In their grammar of Dano-Norwegian (precursor of Bokmål), Falk & Torp suggest that non-V2 questions actually consist of two sentences: ka (er det) du vil ‘what (is it) you want’ (1900:289). Iversen (1918:37) suggests that ellipsis causes direct questions to get the same word order as indirect or embedded questions in the dialect of Tromsø. This claim is also made for the dialect of Bodø (Fiva 1990:214). Knudsen makes the same observation and states that Central and Northern Norwegian dialects have transferred the word order of indirect questions to direct questions; he supports this claim with the observation that wh-elements in non-V2 questions must be followed by som if they are subjects, as is usual in embedded questions (1949:68). Different syntactic accounts of how the deviation from the generalised matrix V2 syntax developed have been proposed. One of the first proposals concerning the historical origins of non-V2 in wh-questions was put forward by Lie (1992). He argues that non-V2 developed from cleft sentences following the pattern in (10) below, where the expletive pronominal subject in the matrix is deleted first (10b) and subsequent omission of the matrix verb leads to non-V2 order (1992:72). The phonological reduction of the construction through haplology can in this view be seen as a motor for a syntactic reanalysis, eventually yielding a non-V2 order in questions no longer associated with the cleft construction.

10. a. Hå e de du si?

what is it you say

‘What is it you are saying?’

b. Hå e du si?

what is you say

c. Hå du si?

what you say

‘What are you saying?’

11. Naa kaa va du saag? (Edvard Storm 1785)

now what be.PST you see.INF

‘What did you see now?’

Lie (1992) supports his analysis with a few examples from historical dialectological research and literary texts but his proposal is mostly based on examples from his own dialect. In addition to some more recent random examples from dialectological literature (East and West Norwegian dialects), one example providing evidence for the intermediate stage represented by (10b) is a sentence from a 16th


are non-subject wh-questions; the element som does not play a role in Lie’s account of the development of non-V2 interrogatives.

Vangsnes (2005) proposes a ‘microparameter account’ of the variation in which non-V2 word order in wh-questions developed through two reanalyses that only some dialects have undergone, resulting in dialectal variation with regard to word order possibilities in questions. Vangsnes singles out three parameters determining the variation; these parameters focus on whether or not interrogative C must be lexicalised and which elements may do so. The development involves the analysis of short wh-elements as heads as well as reanalysis of the complementizer som as a head. When reanalysis has taken place, all these elements may fill the V2-position and cause V2-violations (2005:207). Vangsnes makes a first attempt at determining the chain of events that has led to the cross-dialectal variation. Though Vangsnes raises some diachronic speculations concerning the development of the variation here, this is not the main focus of his article. Westergaard (2005) also proposes a tentative historical scenario for the development from V2 to non-V2 grammars. In her analysis two factors cause the change from V2 to non-V2. Firstly, an economy principle, i.e. the ‘Head Preference principle’ (cf. Van Gelderen 2004), which proposes that it is more economical to be a head rather than a phrase, causes a historical drift towards head status of the wh-elements, resulting in the occurrence of non-V2 orders. This shift makes it possible for these elements “to move into the head position that the verb normally moves to and thus prevent V2” (2005:282). Subsequently, this results in a drop in frequency for the cue for verb movement, further decreasing the occurrence of V2 order. In a later article, Westergaard links the historical development to the synchronic variation, stating “synchronic microvariation reflects a diachronic development from V2 to non-V2 in Norwegian dialects” (2009:50). The connection made here between synchronic variation and diachronic change will be discussed in the next section.

Similar to the studies of non-V2 within specific dialects, early diachronic accounts of the non-V2 phenomenon (e.g. Westergaard 2005, 2009; Vangsnes 2005; Westergaard & Vangsnes 2005) all assume that the historical development starts with short wh’s being analysed as heads and inserted in the head position that otherwise attracts the finite verb. More recently, it has been proposed that the lifting of the V2-requirement is due to the possibility of insertion of som (Westergaard et al. 2012, Vangsnes et al. 2016). This analysis again builds on Nordgård’s observation (1988:35) that a dialect may have non-V2 word order in matrix wh-questions iff the dialect allows insertion of som under extraction of the embedded subject. Westergaard et al. (2012) put forth that the acceptability judgements on two test sentences with different complementizers from the Nordic Syntax Database (Lindstad et al. 2009) indicate that som-insertion and insertion of the complementizer at under wh-extraction are by and large in complementary distribution, geographically speaking. The order without any complementizer (9a) is accepted across Norway (Westergaard et al. 2012, 2017). The two test sentences are given in (12) below.


12. a. Hvem tror du at har gjort det? (NDS #328)

who think you COMP has done it

b. Hvem tror du som har gjort det? (NDS #332)

who think you COMP has done it

‘Who do you think has done it?’

In accordance with Nordgård’s observation (1985, 1988), Westergaard et al. (2012, 2017) find considerable overlap between speakers who give low acceptability scores to non-V2 matrix questions and speakers who disallow insertion of som under wh-subject extraction (example (12b)), and vice versa. This is taken as confirmation for the connection between changes in the complementizer som and the emergence of non-V2 word order in wh-questions (2012:334ff). Westergaard et al. (2017:34) argue that som acquires the ‘ability to lexicalise heads in the matrix Left Periphery’, making it possible for this item to lexicalise the V2 position, blocking movement of the finite verb to this site. This change in the properties of som is argued to be the result of reduction in wh-cleft constructions, mirroring Lie’s (1992) discussion of reduction in cleft sentences (example (10)):

13. a. Hå va de som skjedde? (Westergaard et al. 2017:36)

what is it SOM happened

‘What is it that happened?’

b. Hå va som skjedde?

what is SOM happened

c. Hå som skjedde?

what SOM happened

‘What happened?’

In (13c), som appears in the matrix Int⁰ head, lexicalizing this V2 position despite earlier occurring only in embedded clauses. The result of this reanalysis, i.e. the change in lexicalization potential of som, arguably only happens in a subset of dialects, causing word order variation across Norway.

2.2.3 Synchronic variation and diachronic change connected

Vangsnes (2005) was the first to suggest that the variation in word order possibilities across dialects reflects a series of stepwise changes. In 2012, Westergaard et al. also hypothesised that the diachronic development of non-V2 word order has led to synchronic ‘microvariation’ in Norwegian dialects (2012:338). In their 2017 article, Westergaard, Vangsnes and Lohndal present a revised and considerably extended version of the earlier proposal. All these analyses are implicitly based on the assumption that ongoing language change typically causes synchronic variation between old and new forms (Kay 1975; Weinreich et al. 1968) as well as the idea that the geographic juxtaposition of different linguistic varieties often suggests a historical succession between these varieties. Grammatical variation visible during periods of syntactic change is thought to reflect grammatical competition (Pintzuk 2003:516); that is, the theorem “das geographische Nebeneinander des historischen Nacheinanders” (‘the geographical side-by-side of the historical one-after-another’) (Trapolet 1905).

Westergaard, Vangsnes and Lohndal (2017) draw on elements from different previous accounts and present the most extensive survey of the distribution of non-V2 to date. Whereas other studies are based on output or judgments obtained from a very small number of informants or from other (written) sources, Westergaard et al. base their account on data from the Nordic Syntax Database (NSD) (Johannessen et al. 2010), which includes circa 400 informants from over 100 different locations across Norway. The loss of the V2 requirement is related to changes in the properties of the complementizer som, and Westergaard et al. distinguish the following stages in the development (2016:27-8):

14. stage 0: General V2
    stage 1: Non-V2 in all subject questions with short and long wh-elements
    stage 2: Non-V2 spreads to non-subject questions with short wh-elements
    stage 3a: Non-V2 spreads to non-subject questions with complex wh-elements
    stage 3b: Non-V2 is restricted to short wh-elements

Table 2.2
Distribution of non-V2 word order across different wh-question types in the proposed diachronic stages

            short subject wh   long subject wh   short non-subj. wh   long non-subj. wh
stage 0     -                  -                 -                    -
stage 1     +                  +                 -                    -
stage 2     +                  +                 +                    -
stage 3a    +                  +                 +                    +
stage 3b    +                  -                 +                    -

Based on description of stages in Westergaard, Vangsnes and Lohndal (2017).

Combining notions from their earlier studies, Westergaard et al. argue that the development from V2 to non-V2 started in simplex subject wh-questions (note that this intermediate stage is not included in (14)) with a change in the properties of the complementizer som, allowing it to lexicalise the verb-second position instead of the verb (2017:34). This order spreads to complex subject wh-questions due to economy principles and frequency issues in the acquisition process (2017:44), which were examined earlier in Westergaard (2005), resulting in stage 1. In the second stage, more non-V2 input causes learners to generalise non-V2 word order from subject to non-subject questions, extending the lexicalization possibility to all short wh-elements. Subsequently, the lexicalization requirement/V2-requirement is either lifted completely (stage 3a), due to the “high frequency differences between simple and complex wh-elements” making less frequent complex wh’s vulnerable to change (2017:40), or, alternatively, dialects develop a complexity constraint allowing non-V2 only with short wh-elements (stage 3b). Stages 3a and 3b are not connected to each other but are both derived from stage 2. Regrettably, Westergaard, Vangsnes and Lohndal have not been able to find a syntactic motivation for the development of stage 3b (2017:41). Table 2.2 above gives an overview of which wh-questions allow non-V2 in which stages.

Figure 2.4 shows the geographical distribution of these dialect types. The synchronic dialects across Norway represent the different stages: four types of dialects with non-V2 in (some or all) wh-questions (2017:25); and a fifth type of dialect that never allows non-V2 (stage 0). One area is marked with ‘X’; according to Westergaard et al. the data available does not give a clear picture of the word order possibilities here (2017:29).

It is important to note that the proposal of this diachronic development is based solely on the pattern of synchronic variation and not on sociolinguistic or historical studies. In my opinion, the methodology of this study has some significant drawbacks. The Nordic Syntax Database used by Westergaard et al. (2017) allows one to build maps showing the judgments per location for one or more sentences, and their account relies strongly on visual inspection of these NSD maps. Even though using the NSD provides a way to include more data than has previously been done, there are still some shortcomings to this method. A major drawback of the method used by Westergaard et al. is that individual results are not taken into account. On the maps drawn up in the NSD, judgements from several speakers are collapsed into a single score per location, dismissing individual variation. In particular, the internal hierarchical structure of the database, which includes speakers from different age groups, is thus not taken into consideration. The map-building tool in the Nordic Syntax Database furthermore does not allow one to make maps for different combinations of judgements; it only provides options to show either high, medium or low scores for each location, not a combination of different scores. This way, only the geographic distribution of single linguistic features can be studied. Consequently, Westergaard et al. (2017) fail to acknowledge variation within locations as well as the role of sociolinguistic factors, such as the age of the speaker, that may influence word order possibilities. Furthermore, when using judgement data one should always be careful not to draw overly strong conclusions, as speaker judgements can easily be influenced by a norm set by e.g. standard (written) language (K.M. Pedersen p.c., October 31, 2016; Rietveld & Van Hout 1993:224). Besides, Westergaard et al. test only four combinations of judgements (corresponding to stages 1-3b), disregarding other possible combinations of V2/non-V2 word orders across the four types of questions in Table 2.2. Finally, no historical evidence for this developmental path is provided in the article.

Figure 2.4 Map of Norway where numbers indicate contemporary dialects with grammar types representing assumed diachronic stages (Westergaard et al. 2017:29)

Westergaard et al. motivate the transitions between the different stages in their proposal by economy principles and frequency issues in the acquisition process (2017:44). The motivation for the transition from stage 2 to stage 3b, where non-V2 is allowed only in short (subject and non-subject) wh-questions, remains unclear: no syntactic motivation is found for this transition (2017:41). Since som at this stage is able to block verb-second movement in subject questions, the complexity of the wh-element should have no effect on the presence or absence of som. It is thus surprising that only short and not long subject wh-questions allow non-V2 order at this stage.

In summary, the diachronic accounts discussed in this section all propose a diachronic development from V2 to non-V2 in Norwegian wh-questions that relies mainly on patterns found in synchronic data. Little to no historical data is provided to confirm the proposed historical account. The objective of the current study is to show how the careful study of corpus and database material, accounting for individual and sociolinguistic variation, as well as of historical sources, can be used to confirm such a proposal and provide additional novel insights into the development and variation of non-V2 word order. In the next section I lay out my hypotheses and predictions regarding non-V2 word order in Norwegian wh-questions, based on the discussion above.

2.3 Hypotheses and predictions

Recall that synchronically, the acceptance of non-V2 orders in wh-questions differs substantially across dialects and depends on the type and function of the wh-element. Different proposals on the origins and development of this construction have been presented in recent studies. On the basis of the literature reviewed in this section, I present a proposal regarding the development of non-V2 wh-questions in Table 2.3 below, adopting the division in stages proposed by Westergaard et al. (2017). Even though I maintain the stages proposed by Westergaard et al., the proposal differs from that account in a number of ways. Firstly, stage 1a, in which non-V2 orders occur with short subject wh’s only, is added. This stage was described in Westergaard et al. (2017) but not included explicitly in their account. I follow Westergaard et al. (2012, 2017) in their proposal that non-V2 word order started in subject wh-questions, as research has indicated a strong link between non-V2 word order and insertion of som. Since short wh-questions are much more frequent than complex wh-questions, I predict that non-V2 order first surfaced in simplex wh-questions. After som is reanalysed as an element that can lexicalise the V2-position, non-V2 can also spread to complex subject wh-questions in stage 1b. Next, non-V2 is extended to all short questions, because simplex wh-questions are twice as frequent as complex wh-questions (Vangsnes & Westergaard 2014), resulting in stage 2. In contrast with Westergaard et al., I suggest rule-reduction as a motivation for the development of both stage 3a and 3b from stage 2. Stage 2 is arguably the most difficult stage to maintain, as it is guided by the largest set of rules. In this stage, non-V2 is allowed in three types of questions: simplex and complex subject wh-questions and short non-subject wh-questions. As complex wh-questions are relatively infrequent, lack of input may increase uncertainty around the grammaticality of different word orders in these questions, especially if speakers also need to discern between subject and non-subject wh-questions. All other stages can be explained with fewer rules, i.e. they restrict non-V2 to either one type of wh-element (stage 3b) or to only subject wh’s (stage 1), or allow non-V2 across all types of wh’s (stage 3a). Taking stage 2 as the archetype for stage 3a and 3b, the transition to these two different stages can be motivated as a generalisation of the non-V2 rule resulting in 3a (‘everything goes’) on the one hand, or a simplification of the rule on the basis of prosody resulting in 3b (only simplex non-V2 questions) on the other.

Table 2.3
Development of non-V2 word order across different wh-question types across proposed diachronic stages

            short        long         short          long           permutations connecting the stages
            subject wh   subject wh   non-subj. wh   non-subj. wh
stage 0     -            -            -              -
stage 1a    +            -            -              -              non-V2 emerges in subject wh-questions triggered by change in properties of som
stage 1b    +            +            -              -              non-V2 spreads to all subject wh-questions
stage 2     +            +            +              -              extension to all short wh-questions due to frequency of short wh-questions
stage 3a    +            +            +              +              reduction of rules: everything goes/syntactic generalisation
stage 3b    +            -            +              -              reduction of rules: prosodic rule

The above overview of the development of non-V2 order in wh-questions forms the basis of three hypotheses central to the current study. Based on the assumption that synchronic variation often mirrors diachronic change (q.v. 2.2.3), I hypothesise that the different dialects where non-V2 word order is used can be connected in that they represent different stages of a diachronic development from V2 to non-V2.

Secondly, I hypothesise that this development started in simplex subject wh-questions triggered by a change in the lexicalization possibilities of som moving to the Left Periphery (Vangsnes 2005; Westergaard et al. 2012, 2017). Non-V2 order is most widely accepted across Norway in that question type. I hypothesise that this change happened first in Northern Norwegian dialects (Trøndelag and northwards) as non-V2 seems to have spread to the majority of question types in these dialects synchronically.


Finally, I hypothesise that the different dialect stages are connected to each other by a minimal number of permutations, as explained in Table 2.3 above. Difference in the frequency of simplex v. complex wh-questions plays a significant role in the transition from one stage to the next.

The methodology of the present study can be divided into two parts. Firstly, I investigate the synchronic variation with regard to V2/non-V2 word order across Norway on the basis of acceptability judgement data from the Nordic Syntax Database (Lindstad et al. 2009). Secondly, I investigate how non-V2 order has developed and spread through the different dialects by examining historical texts. Both methods are used to test the hypotheses above. In the database material, the predictions following from the hypotheses above are:

i. Acceptability judgements on the four types of questions for different speakers correspond greatly to the combination of V2/non-V2 word orders in the different stages. If the stages are indeed connected, there will be a considerable amount of variation between these stages both within locations (between the two generations) and between neighbouring locations.

ii. Young and older speakers differ in their judgments on the acceptability of non-V2 in the synchronic data: younger speakers allow non-V2 in non-subject wh-questions (without som) and complex wh-questions to a higher degree than older speakers, as non-V2 in these types of question is hypothesised to be a newer development.

iii. Younger speakers align with younger dialect stages (i.e. younger speakers’ judgements correspond to stage 3a/b more often than older speakers’ judgements) and older speakers to older dialect types.

For the historical source material the predictions following from the hypotheses above are:

i. Dialect material from different dates but stemming from the same location will follow the progression in dialect stages as laid out in Table 2.3.

ii. Earliest examples of non-V2 wh-questions are found in Northern Norwegian dialects specifically in short subject wh-questions.

iii. The earliest observations of dialects with non-V2 word order in wh-questions correspond to oldest dialect types in Table 2.3.
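The third hypothesis, that the stages are connected by a minimal number of permutations, can be made concrete by counting how many question types change their acceptance between two connected stages. The following is a minimal sketch under stated assumptions: the 0/1 strings are merely a compact rendering of the -/+ cells in Table 2.3, and the names STAGES, TRANSITIONS and changed_question_types are hypothetical labels introduced for illustration only.

```python
# Acceptance patterns from Table 2.3, in the order:
# short subject, long subject, short non-subject, long non-subject.
# "1" renders "+" (non-V2 allowed), "0" renders "-".
STAGES = {
    "0": "0000",
    "1a": "1000",
    "1b": "1100",
    "2": "1110",
    "3a": "1111",
    "3b": "1010",
}

# Transitions proposed in the text (stages 3a and 3b both derive from stage 2).
TRANSITIONS = [("0", "1a"), ("1a", "1b"), ("1b", "2"), ("2", "3a"), ("2", "3b")]


def changed_question_types(code_a: str, code_b: str) -> int:
    """Count the question types whose acceptance differs (Hamming distance)."""
    return sum(a != b for a, b in zip(code_a, code_b))


for source, target in TRANSITIONS:
    steps = changed_question_types(STAGES[source], STAGES[target])
    print(f"stage {source} -> stage {target}: {steps} change(s)")
```

Under this rendering, every proposed transition changes exactly one question type, whereas a direct step from stage 1b to stage 3b, for instance, would require two changes.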

3 Methods

This section describes the two-fold research method used to test the above predictions. Before moving to a presentation of the methods used in the present research, some background of the methodology will be discussed.


3.1 Critical assumptions and prerequisites of methodology

3.1.1 Connection between synchronic variation and diachronic change

Sociolinguistic research and studies focussing on language change have shown that the synchronic and the diachronic dimension of linguistic phenomena are often related (Meyerhoff 2013:24). This assumption is twofold: synchronic variation is often the source of a linguistic change (Kay 1975; Weinreich et al. 1968); at the same time, diachronic changes often result in synchronic variation. More specifically, synchronic variation as a consequence of diachronic change may typically be found between different generations, where the language use of older generations represents an older stage of the language while younger generations show a newer stage (Labov 1994). Language acquisition plays an important role in this scenario: children learn their language from their parents but do not always acquire exactly the same grammar and/or language use as their parents. Therefore, speakers’ linguistic behaviour may always differ somewhat from that of the preceding generation. If this development proceeds through acquisition, each generation is expected to show a new stage. A critical assumption underlying this pattern is thus that each generation acquires the language during the critical period, i.e. early childhood, and that speakers generally do not change their grammars substantially later in life (Boberg 2004:256). As different generations live next to each other over time, the differences between them lead to synchronic variation. True diachronic studies have a clear practical disadvantage, as they require longitudinal data, which may take several decades to gather. Researchers who want to investigate a diachronic change may therefore instead make use of the so-called apparent-time method (Labov 1965), which is based on the observation that synchronic variation between different birth cohorts may be due to an on-going language change (Labov 1994; McMahon 1994:240), as explained above.
Using this method, historical inferences about language change can thus be made on the basis of synchronic surveys of subjects of different ages, on the assumption that, other things being equal, people retain the dialect and accent features acquired in their formative years throughout their lives. Other factors may of course also contribute to the variation between speakers of different age groups. In particular, age grading, a stable situation with variation between generations, may also play a role (Labov 1965, 1994; Sankoff 2006). In such a case, some language features may be associated with a particular phase in life. Speakers may use some feature frequently at a younger age but abandon this feature again when they grow older. If this pattern of acquiring and dropping features remains constant through the generations, differences found between age groups are due to age grading with no diachronic change involved. Combining apparent-time data with real-time data can eliminate the effect of age grading. If historical data have already indicated that a change is going on or has been going on, it is very likely that differences between the age groups that suggest a similar change are indeed due to this diachronic change (Boberg 2004:251; McMahon 1994:240; Sankoff 2006). In the case of V2/non-V2 variation in Norwegian we have no reason to believe that the variation is motivated by age grading: research on the choice for non-V2 or V2 word order has not uncovered any speaker-specific factors guiding the variation (for discussion see 2.2.1). Still, support from true historical data can only further solidify the results of an apparent-time study.
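The logic of an apparent-time comparison can be illustrated with a small sketch. It assumes that each speaker is reduced to an age group and an acceptance pattern over the four question types of Table 2.2, written as a 0/1 string (so that ‘1111’ renders stage 3a and ‘1010’ stage 3b). The speaker records below are invented purely for illustration and are not taken from any database; the function name share_by_age is likewise hypothetical.

```python
from collections import Counter

# Hypothetical speakers: (age group, acceptance pattern). Invented for illustration only.
SPEAKERS = [
    ("old", "0000"), ("old", "1100"), ("old", "1110"), ("old", "1110"),
    ("young", "1110"), ("young", "1111"), ("young", "1111"), ("young", "1010"),
]


def share_by_age(speakers, patterns):
    """Proportion of speakers per age group whose pattern is in `patterns`."""
    totals = Counter(age for age, _ in speakers)
    hits = Counter(age for age, pattern in speakers if pattern in patterns)
    return {age: hits[age] / totals[age] for age in totals}


# Share of speakers per age group showing one of the latest stages (3a or 3b).
late_stage_share = share_by_age(SPEAKERS, {"1111", "1010"})
print(late_stage_share)
```

A higher share of late-stage patterns among younger speakers would be compatible with ongoing change, although, as noted above, such a difference alone cannot exclude age grading.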

3.1.2 Combining formal-theoretical and quantitative-statistical linguistics

In recent years, there has been a trend in linguistic research towards the presentation and examination of empirical data and away from purely theoretical analyses (cf. e.g. Meyer 2002; Gries 2006; Bloem et al. 2014 for discussion). The use of quantitative methods such as the analysis of automatically parsed corpora has been particularly successful in studies of language variation and structure (Bloem et al. 2014). Methods within dialectometry, the use of computational and quantitative techniques in dialectology (Nerbonne & Kretzschmar 2013, Van Craenenbroeck 2015), and the use of statistical mixed-effects/multi-level models (Gries 2015) have the potential to further our understanding of linguistic variation. These methods bridge the gap between formal-theoretical and quantitative-statistical linguistics and allow for the detection and identification of grammatical parameters in large and highly varied data sets. This is also the case for the computerization of language mapping (for discussion see Kehrein et al. 2010); data-analysis methods such as multidimensional scaling (Embleton 1987) and the implementation of Geographical Information Systems (GIS) (e.g. Kretzschmar & Light 1996) let us better visualise linguistic variation. Together, these quantitative methods allow us to identify different dialect areas and linguistic configurations of phenomena before theoretical description of the linguistic features comes into the analysis. In my opinion, the use of such computational and quantitative techniques can also further the research on word order variation in Norwegian wh-questions. For example, in Westergaard et al.’s analysis of the data in the Nordic Syntax Database (Lindstad et al. 2009), acceptability judgement scores for a number of speakers are collapsed per location, which is methodologically unfortunate and might be remedied with a more data-driven approach exploiting all quantitative data in the database.

The current study is two-fold in that both synchronic distribution as well as historical development of V2/non-V2 word order variation is examined. The remainder of this section will accordingly focus on two sets of methods used to examine these issues.

3.2 Examining synchronic distribution

3.2.1 Data collection

This study uses acceptability judgement data from the Nordic Syntax Database, also used by Westergaard et al. (2017). The database consists of acceptability judgments of 924 Nordic dialect speakers from over 200 locations; these speakers judged a large number of sentences on a scale from 1 (‘bad’) to 5 (‘good’) (Lindstad et al. 2009). The sentences in the questionnaire can be sorted and filtered in many ways according to sociolinguistic variables or the type of syntactic phenomenon included. The data used in this study is a subset of the Nordic Syntax Database, limited to Norwegian speakers. The subset includes judgements from 409 speakers from 105 locations across Norway. The subset is further limited to the four types of wh-questions central to this study (see Table 3.1), resulting in a total of 1,580 judgement scores.

Table 3.1
Overview of questions used

Question type          Question text                                     NSD reference number
short subject wh       Hvem SOM selger fiskeutstyr her i bygda, da?      #17
                       who COMP sells fishing gear here in town, then
long subject wh        [Hvor mange elever] SOM går på denne skolen?      #1228
                       how many students that go to this school
short non-subject wh   Hva du heter?                                     #988
                       what you called
long non-subject wh    [Når tid] du gjekk ut av ungdomsskolen a?         #33
                       what time you went out of middle school then

The dataset used includes judgements from both men and women (200 v. 209 respectively) and includes both young (15-30 years old) and older (50+ years) speakers (205 v. 204 speakers). Versions of the test sentences with verb-second word order were not included in the questionnaire; consequently, no information on the acceptability of V2 in these specific questions is available3.

The Nordic Dialect Corpus (NDC) (Johannessen et al. 2009, 2010) is used to examine the usage frequency of the different wh-questions across Norway. NDC is a corpus of Norwegian, Swedish, Danish, Faroese, Icelandic and Elfdalian spoken language and contains spontaneous speech data from dialects across Scandinavia. The corpus consists of conversations and interviews with dialect speakers and comprises about 2.8 million words. All recordings in the corpus are orthographically transcribed both in the dialect and in Standard Norwegian and are additionally annotated for a variety of phonological and syntactic features.

3.2.2 Coding NSD-data

To better assess the acceptability of the different wh-questions in the data set, judgement scores were converted to dichotomous scores: low scores (‘1’ and ‘2’) were converted to ‘0’ (not accepted by the speaker) and medium and high scores (‘3’, ‘4’, ‘5’) to ‘1’ (accepted). Based on the four wh-questions selected from the Nordic Syntax Database and the binary coding, one can logically infer 16 different combinations of scores. Due to some missing values, such a combination of scores across all four questions could be computed for 373 of the 409 speakers. For convenience, Table 2.2, which shows the acceptability of non-V2 in different wh-questions across the dialect stages proposed by Westergaard et al. (2017), is repeated here as Table 3.2 with the binary coding that will be used from here on.

Table 3.2
Acceptance of non-V2 word order across different wh-question types

            short subject wh   long subject wh   short non-subj. wh   long non-subj. wh
            #17                #1228             #988                 #33
stage 0     0                  0                 0                    0
stage 1     1                  1                 0                    0
stage 2     1                  1                 1                    0
stage 3a    1                  1                 1                    1
stage 3b    1                  0                 1                    0

To confirm the feasibility of converting the acceptability scores, ranging from ‘1’ (bad) to ‘5’ (good), to a binary system, the distribution of the scores across this scale was examined. The distribution of the scores across the four items was bimodal to such an extent that it was judged to be reasonably representative to read scores ‘1’ and ‘2’ as ‘not accepted’ and scores ‘3’ and up as ‘accepted’. Figure 3.1 shows the distribution of scores across the four test questions.
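The coding procedure described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the script actually used: a speaker's judgements are assumed to be available as a mapping from NSD sentence number to score, and the names QUESTION_IDS, STAGE_BY_CODE and combination_code are hypothetical.

```python
from typing import Dict, Optional

# NSD sentence numbers in the column order of Table 3.2.
QUESTION_IDS = ("17", "1228", "988", "33")

# Combinations that correspond to the stages of Westergaard et al. (2017).
STAGE_BY_CODE = {
    "0000": "stage 0",
    "1100": "stage 1",
    "1110": "stage 2",
    "1111": "stage 3a",
    "1010": "stage 3b",
}


def combination_code(scores: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the four-digit acceptance code, or None if a judgement is missing."""
    digits = []
    for question_id in QUESTION_IDS:
        score = scores.get(question_id)
        if score is None:
            return None  # incomplete speaker: no combination can be computed
        digits.append("1" if score >= 3 else "0")  # 1-2 not accepted, 3-5 accepted
    return "".join(digits)


example = {"17": 5, "1228": 3, "988": 2, "33": 1}  # hypothetical speaker
code = combination_code(example)
print(code, STAGE_BY_CODE.get(code, "no proposed stage"))
```

A speaker who accepts only the two subject questions thus receives the code ‘1100’; any code outside the five listed patterns corresponds to none of the stages in Table 3.2.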

Figure 3.1 Frequency distribution of scores on 4 test items

3.2.3 Data analysis

The grammaticality judgement data available in the database is analysed to further unravel the variation and distribution of non-V2 word orders across Norway. To do so, the judgement data is split by different sociolinguistic and dialectological factors, among others gender, age of the speaker, region and travel time from the capital Oslo. Speaker-specific metadata is available within the database; travel time to each location was computed using Google Maps. The apparent-time method (Labov 1965, 1994) is used to uncover transitions between the different proposed developmental stages. In addition to the apparent-time method, several different techniques are used to map the variation across Norway. The web application Gabmap (Nerbonne et al. 2011) is used to examine and map linguistic differences between locations based on all the variables in the data set. Within the same application, cluster analysis (cf. Prokić & Nerbonne 2008) is carried out to classify dialects and identify dialect areas; that is, geographic places are divided into clusters based on their linguistic similarity.

To find relevant examples of non-V2 wh-questions in the Nordic Dialect Corpus (Johannessen et al. 2009, 2010), the corpus was searched using a combination of search methods. Unfortunately, wh-elements are not coded as a separate word class in the corpus; therefore, each wh-element must be searched individually. An overview of the search method is given in Table 3.3.

Table 3.3
Overview of search methods applied in Nordic Dialect Corpus

Question type          Example of search query
short subject wh       < segment initial “hva” > [0,0] < “som” > [0,0] < verb >
                       < pause > [0,0] < “hva” > [0,0] < “som” > [0,0] < verb >
long subject wh        < segment initial “hvordan” > [0,0] < “som” > [0,0] < verb >
                       < pause > [0,0] < “hva” > [0,0] < “som” > [0,0] < verb >
short non-subject wh   < segment initial “hva” > [0,0] < pron:det or pron > [0,0] < verb >
                       < pause > [0,0] < “hva” > [0,0] < pron:det or pron > [0,0] < verb >
long non-subject wh    < segment initial, start of word “hv…” > [0,2] < noun > [0,0] < pron:det or pron > [0,0] < verb >
                       < pause > [0,0] < start of word “hv…” > [0,0] < noun > [0,0] < pron:det or pron > [0,0] < verb >

As the majority of wh-items in Standard Norwegian start with the letter combination hv-, this sequence was used to find most of the relevant examples. For each type of question a number of different search queries were used to find examples with non-V2 order. To eliminate embedded questions, which always have non-V2 order, the search was limited to wh-elements coded as ‘segment-initial’ and elements occurring after a pause. These limitations have undoubtedly caused the loss of some relevant results, though I estimate that this has not had any major impact on the relative frequency of the different question types. Complex wh-elements such as e.g. hvor mange elever ‘how many students’ were found by looking at multiple-word sequences starting with hv- followed by a noun, and searching for the complex wh’s hvordan ‘how’, hvorfor ‘why’, når ‘when’ and hvilken ‘which’.
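The logic behind the segment-initial queries for short wh-elements can be mimicked outside the corpus interface. The sketch below does not reproduce the corpus's own query syntax or tagset: it assumes that a segment is available as a list of (token, tag) pairs, the tags "verb" and "pron" are placeholders, and SHORT_WH and classify_segment are hypothetical names chosen for illustration.

```python
# Assumed representation: a segment is a list of (token, tag) pairs.
# The tags "verb" and "pron" are placeholders, not the corpus's actual tagset.
SHORT_WH = {"hva", "hvem"}  # assumed sample of short wh-words


def classify_segment(segment):
    """Label a segment-initial non-V2 pattern with a short wh-word, or return None."""
    if len(segment) < 3:
        return None
    (first_word, _), (second_word, second_tag), (_, third_tag) = segment[:3]
    # The wh-word must be segment-initial and the verb must be in third position.
    if first_word.lower() not in SHORT_WH or third_tag != "verb":
        return None
    if second_word.lower() == "som":
        return "short subject wh"      # wh + som + verb
    if second_tag == "pron":
        return "short non-subject wh"  # wh + pronoun + verb
    return None


print(classify_segment([("Hva", "pron"), ("du", "pron"), ("heter", "verb")]))
```

A question with verb-second order, such as Hva heter du, has the verb in second position and is therefore not matched, which mirrors the intention of the queries in Table 3.3. As with the corpus queries, matches would still need to be checked manually for non-main clauses.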

The combination of search queries yielded a total of 880 examples; after non-main clause sequences were manually excluded 331 relevant results were left.

3.3 Studying diachronic development

3.3.1 Data collection

Grammars of the Norwegian language and its dialects (and their predecessors), as well as descriptions of Norwegian dialects and other Scandinavian dialectological literature, are consulted as a source of historical data on the emergence and distribution of non-V2 word order in wh-interrogatives. The selection of material was based on convenience, combining sources available through the University of Amsterdam, Leiden University and the University of Copenhagen4. Only sources predating 1950 are included, as the older speakers in the NSD will already cover more recent language variation. To further expand the sample, letters and fiction from this time period were included, focussing specifically on sources from Northern Norway.

3.3.2 Data analysis

Historical material was manually searched for examples of main clause wh-questions, both with verb-second and with non-V2 word order. In dialect descriptions sections on interrogative pronouns and adverbs as well as sections on sentence structure and word order were explored specifically. For all examples, the location and date were recorded as well as whether non-V2 was allowed and in which contexts (subject v. non-subject questions and simplex v. complex wh). For each location where examples of non-V2 wh-questions were found, the same coding as for the synchronic material was used to provide an overview of the different combinations of non-V2/V2 orders within the dialects.

4 Results

4.1 Presentation and interpretation of the quantitative data

In this section the results of the analysis of the data in the Nordic Dialect Corpus (NDC) and Syntax Database (NSD) (Johannessen et al. 2010) are discussed. Here, I test the hypotheses outlined in section 2.3 concerning the word order variation in wh-questions across Norway and the development of this word order.

4.1.1 Nordic Syntax Database

Firstly, I predicted that the acceptability judgements on the four types of questions for different speakers correspond greatly to the combination of V2/non-V2 word orders in the different stages proposed in section 2.3 and that these stages are connected. Consequently, variation both within locations (between the two generations) and between neighbouring locations is expected.

To test this prediction, the judgements of 409 Norwegian speakers in the NSD were converted to a code based on the combination of acceptability scores the speakers assigned to the four non-V2 wh-questions (Table 3.1). For example, if a speaker accepts (score ‘3’ or higher) only subject non-V2 interrogatives but not non-subject non-V2 interrogatives, this speaker gets the code ‘1100’. Table 4.1 below provides an overview of the possible code combinations and their distribution across speakers in the dataset. For convenience, it is also indicated which codes correspond to which dialect stage as proposed by Westergaard et al. (2017).

4 I would like to thank Nordisk Forskningsinstitut of the University of Copenhagen for accommodating me as I spent three weeks in October 2016 roaming their library for the bulk of these source materials. I am especially grateful to Pia Quist, Karen Margrethe Pedersen, Inge Lise Pedersen and prof. Tore Kristiansen for our inspiring discussions on the topic of non-V2 word order in Scandinavian.

Table 4.1
Frequency and relative percentage for different combinations of judgements

Combination   Corresponding dialect in WVL (2017)   Frequency   Percentage
0000          type 0                                 82          20.0
0001                                                  4           1.0
0010                                                 11           2.7
0011                                                  4           1.0
0100                                                  8           2.0
0101                                                  6           1.5
0110                                                  1           0.2
0111                                                  3           0.7
1000                                                 16           3.9
1001                                                  0           0.0
1010          type 3b                                72          17.6
1011                                                 11           2.7
1100          type 1                                 16           3.9
1101                                                  4           1.0
1110          type 2                                 51          12.5
1111          type 3a                                84          20.5
incomplete                                           36           8.8
Total                                               409         100.0

Figure 4.1 Frequency of use of different combinations of judgements on 4 non-V2 questions
[Bar chart. x-axis: combination of judgements (‘0000’ to ‘1111’); y-axis: frequency (0 to 100). Light grey bars: with medium ‘3’ scores; dark grey bars: without medium scores. Bar values:]

Combination   With medium ‘3’ scores   Without medium scores
0000          82                       82
0001           4                        4
0010          11                        9
0011           4                        4
0100           8                        7
0101           6                        6
0110           1                        0
0111           3                        2
1000          16                       11
1001           0                        0
1010          72                       65
1011          11                        8
1100          16                       13
1101           4                        4
1110          51                       26
1111          84                       48

Figure 4.1 above provides a graphical overview of the distribution in Table 4.1 (light grey bars). Apart from the combination ‘0000’, where only V2 order is accepted, three combinations of judgements are very frequent: ‘1010’, ‘1110’ and ‘1111’. These three types correspond to dialect stages 3b, 2 and 3a, respectively, in Westergaard et al. (2017). Combination ‘1100’, corresponding to the proposed dialect stage 1, does not stand out clearly in the above distribution. This might point to an overestimation of dialect stage 1 by Westergaard et al. (2017). Furthermore, combination ‘1000’ is as frequent as ‘1100’, indicating that it might be considered a first step in the development of stage 1 (‘1100’), as hypothesised earlier. That is, ‘1000’ can be seen as the intermediate ‘stage 1a’, which develops into ‘1100’/‘stage 1b’. The most frequent combinations are ‘0000’, corresponding to type 0 in Westergaard et al. (2017), where non-V2 order is never accepted, and the mirror image of this type, combination ‘1111’/type 3a. Other combinations that are easily derived from the ‘0000’-standard are ‘1010’, where non-V2 is only accepted with short wh’s; ‘1100’, which allows non-V2 only in subject wh-questions; ‘0101’, allowing only long wh’s to have non-V2; and ‘0011’, which allows non-V2 only with non-subject wh’s. Of these, only the first combination, ‘1010’ (stage 3b), is very frequent (72 instances). The maximally different types ‘0110’ (non-V2 in long subject and short non-subject wh-questions) and its mirror ‘1001’, combinations that require the largest set of rules to be learned, are as expected the least frequent.
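The four-digit combination codes can be read off mechanically. The following is a minimal sketch, not part of the original analysis: the digit order (short subject, long subject, short non-subject, long non-subject) is inferred from the examples given in the text, and the names `QUESTION_TYPES` and `decode_combination` are hypothetical.

```python
# Assumed digit order, inferred from the examples in the text
# (e.g. '1100' = subject wh-questions only, '1010' = short wh's only).
QUESTION_TYPES = (
    "short subject",
    "long subject",
    "short non-subject",
    "long non-subject",
)


def decode_combination(combination):
    """Return the question types in which non-V2 is accepted (digit '1')."""
    if len(combination) != 4 or set(combination) - {"0", "1"}:
        raise ValueError("expected a four-digit string of 0s and 1s")
    return [
        question_type
        for digit, question_type in zip(combination, QUESTION_TYPES)
        if digit == "1"
    ]


for code in ("0000", "1100", "1010", "0110"):
    print(code, decode_combination(code))
```

Under this reading, ‘0110’ decodes to long subject plus short non-subject questions, which matches the description of the maximally different type above.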

In an attempt to minimise noise in the distribution, combinations containing medium judgements (score ‘3’) were removed before calculating the combination frequencies again (Figure 4.1, dark grey bars). Chi-square analysis of the resulting distribution shows that it is not significantly different from the original (χ²(14) = 10.9876, p = 0.687). Unexpectedly, the biggest differences between these two distributions are found not in the infrequent combinations (such as ‘1000’ or ‘1100’) but in the combinations that allow non-V2 in most or all wh-questions (i.e. ‘1110’ and ‘1111’). For both of these combinations, the difference in frequency between the two distributions (with vs. without medium scores) was significant (‘1110’: χ²(1) = 8.1169, p < 0.01, a 49% decrease; ‘1111’: χ²(1) = 9.8182, p < 0.01, a 43% decrease). The majority of medium scores for the combination ‘1110’ are found for complex subject wh-questions (Figure 4.2a) and, for ‘1111’-speakers, for complex non-subject questions (Figure 4.2b). The high number of medium scores fits with the documented low frequency of complex wh-questions; lack of input might make speakers insecure about the acceptability of complex non-V2 interrogatives. For ‘1110’-speakers specifically, this uncertainty facilitates acquisition of the ‘1010’-dialect if their acceptance of complex subject wh-questions drops.
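The two single-combination statistics and the percentage decreases can be checked with a few lines of arithmetic. The sketch below assumes that each χ²(1) value comes from a goodness-of-fit test of the two frequencies (with and without medium scores) against equal expected counts. The text does not state this, but the assumption reproduces the reported values. The function names are hypothetical, and the frequencies are those shown in Figure 4.1.

```python
def chi2_equal_split(count_a, count_b):
    """Goodness-of-fit chi-square for two counts against equal expected counts."""
    expected = (count_a + count_b) / 2
    return sum((observed - expected) ** 2 / expected for observed in (count_a, count_b))


def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100 * (before - after) / before


# (frequency with medium scores, frequency without medium scores)
counts = {"1110": (51, 26), "1111": (84, 48)}

for combination, (with_medium, without_medium) in counts.items():
    statistic = chi2_equal_split(with_medium, without_medium)
    decrease = percent_decrease(with_medium, without_medium)
    print(combination, round(statistic, 4), round(decrease))
# 1110 8.1169 49
# 1111 9.8182 43
```

The overall χ²(14) comparison is not reproduced here, since the exact test set-up for the full distribution cannot be recovered from the text.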
