Mapping the dimensions of linguistic distance: A study on South Ethiosemitic languages

University of Groningen

Feleke, Tekabe Legesse ; Gooskens, Charlotte; Rabanus, Stefan

Published in: Lingua

DOI: 10.1016/j.lingua.2020.102893

Document version: Publisher's PDF, also known as Version of record

Publication date: 2020

Citation for published version (APA):

Feleke, T. L., Gooskens, C., & Rabanus, S. (2020). Mapping the dimensions of linguistic distance: A study on South Ethiosemitic languages. Lingua, 243, [102893]. https://doi.org/10.1016/j.lingua.2020.102893

Tekabe Legesse Feleke a,*, Charlotte Gooskens b, Stefan Rabanus a

a Department of Linguistics, Verona University, Italy
b Department of Linguistics, University of Groningen, The Netherlands

Received 3 January 2020; received in revised form 21 April 2020; accepted 22 April 2020
Available online

Abstract

We measured the distance among selected South Ethiosemitic languages from three dimensions: structural, functional and perceptual. The main objectives of our study were to determine the relationship among these three dimensions of linguistic distance, to re-examine previous classifications of the languages, and to measure the degree of mutual intelligibility among the language varieties. We determined the structural distance by computing the lexical and phonetic differences. The phonetic distance was computed using the Levenshtein algorithm. A Word Categorization test was adopted from Tang and van Heuven (2009) to measure the functional distance or the degree of intelligibility. A self-rating test, based on recordings of ‘The North Wind and the Sun’, was administered to measure the perceptual distance among the languages. Then we performed a cluster analysis using Gabmap. Multidimensional scaling was employed for the cluster validation. The results of our study show that the selected South Ethiosemitic languages can be classified into five groups: {Chaha, Ezha, Gura, Gumer}, {Endegagn, Inor}, {Muher, Mesqan}, {Kistane} and {Silt’e}. Moreover, the language taxonomies obtained from the measures of the three dimensions of distance are very similar, and they are generally comparable to the classifications previously proposed by historical linguists. There is also a very strong correlation among the three dimensions of distance. Furthermore, the Word Categorization test results show that the majority of these languages are mutually intelligible, with the exception of Silt’e.
© 2020 Elsevier B.V. All rights reserved.

Keywords: Ethiosemitic languages; Linguistic distance; Combined approach

1. Introduction

Issues of how to distinguish dialects from languages and how to quantify the resemblance between two or more language varieties have been the central concerns of dialectology. These two subjects are often addressed by measuring the distance between two or more language varieties. As a general principle, the more two languages are structurally similar (i.e., phonetically, morphologically, lexically or syntactically), the more they are related to each other; if they are similar enough, they are dialects of the same language. However, distinguishing dialects from languages is more complex because in most cases non-linguistic variables (social, cultural and political) have roles to play. This means that computing linguistic distance solely based on the structural similarity between languages may not always be sufficient to determine whether two language varieties should be considered as dialects of a language or two different languages. In addition to the influences of the non-linguistic variables, there are inherent limitations of the structure-based traditional approach. The structural approach is often criticized for having two drawbacks. First, measuring the linguistic distance requires quantifying the distance among the language varieties. However, languages differ on several dimensions (phonology, phonetics, morphology and syntax) and identifying the level to be measured is a major challenge (Gooskens, 2018, p. 206; Heeringa et al., 2006, p. 51; Tang and van Heuven, 2007, p. 223; Tang and van Heuven, 2009, p. 710). Second, even if all the levels could be measured, determining the relative contribution of each level, and combining the differences into a single mathematical measurement, is another challenge (Chiswick and Miller, 2005, p. 1).

* Corresponding author.
E-mail addresses: tekabelegesse.feleke@univr.it (T.L. Feleke), c.s.gooskens@rug.nl (C. Gooskens), stefan.rabanus@univr.it (S. Rabanus).

Previous studies of dialectology have generally followed two research perspectives to address the aforementioned limitations. On the one hand, there has been a successful shift from measuring linguistic distance solely on the basis of purposefully selected linguistic features to measuring distance on the basis of large aggregate data (Goebl, 2010; Nerbonne and Heeringa, 2001; Nerbonne et al., 2011; Prokić et al., 2013). On the other hand, different methods that take into account the non-linguistic variables, for example, the perception and the knowledge of non-linguists, have been developed in recent decades to circumvent the limitations of the structure-based approach (e.g., Preston, 2010). In this regard, the use of intelligibility as a means of measuring linguistic distance and recent advances in folk linguistics have made important contributions. As a part of these endeavors, different methods of measuring intelligibility have been developed (see Casad, 1987; Gooskens, 2013; Menuta, 2013, pp. 57–58; Voegelin and Harris, 1951).

There have also been various methods of measuring linguistic distance from perceptual perspectives. The perception-based approaches vary in the following ways. Some of them examine the perception of the speakers based on carefully selected language inputs, such as recorded stories (e.g., Beijering et al., 2008); some others measure the overall perception of the speakers without focusing on a specific language input, for example, by asking in which nearby area a similar language is spoken (e.g., Bucholtz et al., 2007; Pearce, 2009; Tamasi, 2003; Montgomery, 2007; Preston, 1996). Moreover, some recent studies have focused on examining the perception of non-linguists regarding specific sound features, such as the features of vowels or consonants (e.g., Labov, 2001; Plichta and Preston, 2005; Niedzielski, 1999). Therefore, dialectologists have taken different paths in attempting to better quantify the distance among related languages, leading to a substantial increase in methods for measuring linguistic distance. These methods can be subsumed into three broad categories: structure-based (based on phonetic, lexical or grammatical similarity), intelligibility-based (based on inherent and acquired intelligibility) and perception-based (based on the perception of native speakers). Previous studies measured linguistic distance either from one or from the combinations of these three perspectives (Casad, 1987, pp. 89–98; Gooskens, 2018, p. 196; Grimes, 1990; Tang and van Heuven, 2009, p. 710; Tang and van Heuven, 2007, p. 223; Tang and Van Heuven). As noted by Gooskens (2018), the degree of correlation among the linguistic distances measured from each of these perspectives is a concern that merits further exploration.

In the present study we further investigate this matter. For the sake of expediency, we use functional distance and intelligibility with slight differences in meaning. We adopt the common definition of intelligibility, which is the degree to which speech variety A is understood by the speakers of speech variety B on the basis of the linguistic similarity between A and B (Gutt, 1980, p. 57). We define functional distance as the degree of difference between language A and language B on the basis of the speakers’ degree of understanding. This distinction is important for two reasons. First, the literature often distinguishes between inherent intelligibility and acquired intelligibility (Gooskens and Heuven, 2018, p. 2; Gutt, 1980, p. 57). In some cases, only inherent intelligibility is considered as mutual intelligibility (e.g., Gutt, 1980; Tang and van Heuven, 2009). We use functional distance to refer to a linguistic distance that is measured using either inherent intelligibility or acquired intelligibility tests or both. Second, both inherent intelligibility and acquired intelligibility are parts of actual communication, which is the main function of language. Hence, functional distance (function-based distance) can best describe all distances measured from this perspective. More importantly, by using functional distance, we make a distinction between intelligibility, which is measured based on actual performance, and perceived intelligibility, which is measured based on the subjective judgement of the native speakers. Based on these considerations, we classify methods of measuring linguistic distance in general as structure-based, function-based and perception-based. The distances that are determined using these methods are therefore considered as structural, functional and perceptual distances respectively.

By examining these three distances, we contribute to one of the continuing debates in dialectology, that is, to what extent these dimensions of distance correlate. In previous works, there have been doubts, for example, about the reliability of the awareness of non-linguists when measuring linguistic distance (Goeman, 1999, p. 141). The correlation between intelligibility and degree of linguistic similarity has also been the concern of several recent studies (Gooskens, 2018; Gooskens and Heuven, 2018; Gooskens et al., 2010). The present study partly addresses these concerns, and examines them in the context of Ethiosemitic languages. In addition to examining the relationship among different perspectives of measuring linguistic distance, we also aim to determine the distance and degree of intelligibility among selected South Ethiosemitic languages: Chaha, Inor, Ezha, Endegagn, Gura, Gumer, Mesqan, Muher, Kistane and Silt’e. These languages were selected on the basis of two parameters: the number of speakers of each variety and the language sub-family the varieties belong to. Given that we wanted to include a high number of participants, language varieties with a relatively high number of speakers were selected based on the Ethiopian National Census Report (ENSA, 2007). We took the numbers of residents of the ethnically defined districts (e.g., Ezha Wereda, total population 84,905) as the numbers of speakers of the varieties. We also sought to include at least one language from each of the five so-called Gurage varieties: Kistane (North Gurage), Muher and Mesqan (West Gurage), Silt’e (East Gurage), Endegagn and Inor (Peripheral West Gurage) and Gura, Gumer, Chaha and Ezha (Central West Gurage).

2. Ethiosemitic languages

Ethiosemitic languages are Semitic varieties that are spoken in Ethiopia and Eritrea. Scholars have proposed different classifications of these languages. The present study largely relies on the classification of Hetzron (1972), which is considered as being most complete in terms of the number of languages included. Ethiosemitic languages are divided into North and South Ethiosemitic. The North Ethiosemitic languages consist of Tigre, Tigrigna and Ge’ez (see Demeke, 2001, p. 2; Hetzron, 1972, p. 3). The South Ethiosemitic branch includes several languages (see Fig. 1). Languages classified under the ‘Outer South’ and ‘Eastern’ branches are traditionally called Gurage languages. According to Demeke (2001, p. 61), Fleming (1968, p. 354) and Faber (1997, pp. 3–4), there is no clear genealogical relationship among all of the Gurage varieties that constitute a large number of the South Ethiosemitic languages. For instance, Silt’e is closer to Harari than to the rest of the Gurage languages. Furthermore, Kistane is closer to Gafat than to other Gurage languages. There is also a controversy about the position of Mesqan. Hetzron (1972) and Hetzron (1977) classified it under West Gurage while other scholars, such as Demeke (2001), classified it under North Gurage. Moreover, Muher does not have a settled position in the classification of Ethiosemitic languages. While Hetzron (1972) classified it under the tt-Group, Demeke (2001) placed it under Central West Gurage. Neither of the studies provided a sufficient description for their classification. Lack of detailed evidence, combined with other factors, such as a long history of contact among Ethiosemitic and other neighboring Afro-asiatic languages, led previous studies to often unsatisfactory conclusions regarding the origin and the classification of the languages (Goldenberg, 1977, p. 462). So far, there is no single clear proposal regarding the origin and the classification of Ethiosemitic languages (Demeke, 2001, p. 61; Hetzron and Bender, 1976, p. 5; Hudson, 2000, pp. 75–76; Goldenberg, 1977, p. 461; Leslau, 1969, p. 97; Leslau, 1992, p. 12).

There are no studies concerning phonetic and perceptual distances among Ethiosemitic languages. However, there are studies on lexical comparisons. For instance, Bender et al. (1972) examined 12 Ethiosemitic languages using a 98-word list from Swadesh (1955). Inor, Chaha, Mesqan and Kistane are among the languages included in this study. According to Bender et al. (1972), none of these languages share more than 80% of cognates. Bender (1971) also made a lexical comparison among several Ethiopian languages. The study adopted a 98-word list from Swadesh (1950) and found that none of the compared Ethiosemitic languages (Amharic, Argoba, Harari, Zay, Wolane, Inor, Chaha, Mesqan, Kistane, Gyeto, Ge’ez and Tigrigna) shared more than 80% of cognates. In the same manner, Hudson (2013) investigated the lexical similarity among 14 Ethiosemitic languages based on a 250-word list. Silt’e, Inor, Chaha, Muher, Mesqan and Kistane are among the languages investigated by this study. The study reported more than 80% shared cognates between Inor and Mesqan, Inor and Muher, Inor and Chaha, Chaha and Muher, Muher and Mesqan, and Mesqan and Kistane. Likewise, using a list of 255 words, Menuta (2015) examined the lexical similarity among six South Ethiosemitic languages: Kistane, Chaha, Inor, Mesqan, Muher and Wolane. The study reported that more than 80% of cognates were shared between Chaha and Inor, Chaha and Mesqan, Chaha and Muher, Mesqan and Chaha, and Mesqan and Muher.

The degree of intelligibility among many of the languages has not been investigated either. To the best of our knowledge, there are three studies that have investigated the degree of intelligibility among South Ethiosemitic languages: Gutt (1980), Ahland (2003) and Menuta (2015). Gutt (1980) examined the intelligibility among six South Ethiosemitic language varieties, Silt’e, Kistane, Chaha, Inor, Mesqan and Amharic, using an oral comprehension task. The results of the study indicate that, based on the 80% intelligibility threshold, only Silt’e and Mesqan are intelligible. In the same manner, Ahland (2003) determined intelligibility among eleven Gurage varieties using oral comprehension questions. According to Ahland (2003), based on an 80% intelligibility threshold, Chaha is intelligible to Ezha, Muher and Gumer; Ezha is intelligible to Gumer; Inor is intelligible to Endegagn; Gumer is intelligible to Ezha and Endegagn; Endegagn is intelligible to Inor; Mesqan is intelligible to Chaha, Ezha and Muher. Menuta (2015) also investigated intelligibility among six Gurage varieties (Kistane, Mesqan, Inor, Chaha, Muher and Wolane). In this study, different tests were used to measure intelligibility: word recognition (words in different parts of sentences were recognized by the respondents), sentence repetition (the informants listened to various sentences and wrote down exactly what they heard), sentence verification (the informants judged sentences that are habitually true by saying ‘true’ or ‘false’), instruction (the respondents performed certain actions based on given instructions) and comprehension questions. Based on the 80% intelligibility threshold, this study reported intelligibility between Chaha and Inor, Chaha and Mesqan, Inor and Mesqan, Mesqan and Kistane, Muher and Chaha, and Muher and Mesqan.

With regard to the geographical distribution of the languages, Ethiosemitic languages in general are spoken in the north, central, east and southwest of Ethiopia. The 10 languages we investigated in the present study are spoken in the southwest part of Ethiopia (see Fig. 2), around 160 km from Addis Ababa, the capital. This small area is sometimes called the Gurage area. It is one of the most linguistically diverse areas in Ethiopia. More than 12 Ethiosemitic varieties are spoken in this area. We adopted the terms ‘Gurage language area’ and ‘Gurage languages’ from earlier works (e.g., Leslau, 1979). However, it is important to mention here that the so-called Gurage languages do not refer to a single genetically attested unit (Hetzron, 1972, p. 119; Meyer, 2011, p. 1221) and, more importantly, some of the speakers of these varieties do not consider themselves as Gurage (Meyer, 2011, p. 1223). Silt’e is taught at elementary level in Silt’e Zone while the remaining varieties are just spoken languages.

Given that there have been debates concerning both the methods of dialectology and the classification of Ethiosemitic languages, the present study aims to address two general objectives. The first one is methodological, that is, to what extent the methods of measuring linguistic distance are related. There are two specific objectives related to the methods: (a) determining to what extent structural, functional and perceptual distances correlate; (b) examining whether the three dimensions of distance can substitute for one another. By addressing these objectives, we illustrate the link among various methods of measuring linguistic distance. We expect strong correlations among the three dimensions of distance based on previous studies (e.g., Beijering et al., 2008; Casad, 1987; Grimes, 1990; Gooskens and Heeringa, 2004; Tang and van Heuven, 2007; Tang and van Heuven, 2009; Van Bezooijen and Gooskens, 2007).

The second general objective is determining the linguistic distance among the selected South Ethiosemitic language varieties. We aim to address four specific objectives related to the Ethiosemitic language varieties: (a) determining the distance among the selected language varieties; (b) classifying the languages using the data obtained from the three dimensions of distance; (c) examining to what extent the taxonomies obtained from structural, functional and perceptual distance measures are similar to the classifications previously proposed by historical linguists; (d) determining the degree of intelligibility among the language varieties. Based on Hudson (2013) and Menuta (2015), we expect very close lexical similarity between Chaha and Mesqan, Chaha and Inor, Mesqan and Muher, and Mesqan and Kistane. Furthermore, based on Tang and van Heuven (2009), we expect close similarity among the taxonomies obtained from the three distance measures, and between these taxonomies and the classifications previously provided by historical linguists. Based on Ahland (2003) and Menuta (2015), we further expect intelligibility between Chaha and Ezha, Chaha and Muher, Chaha and Gumer, Mesqan and Chaha, and Mesqan and Ezha.

3. Methods

This section presents the methods employed to address the objectives presented in Section 1. First, the research assistants and experimental subjects are described, followed by the methods and procedures used to measure structural distance. The methods used to determine functional and perceptual distance are then explained, followed by the clustering and cluster validation techniques.

3.1. Research assistants and experimental subjects

In this study, the terms ‘research assistants’ and ‘experimental subjects’ are used with different meanings. Research assistants were school teachers who participated in selecting test-takers, preparing materials (for example, translating texts) and reading translated texts during the recordings. ‘Experimental subjects’ refers to the students who completed the tests designed to measure functional and perceptual distances. The procedures used to select both the research assistants and the experimental subjects are presented as follows.

Fig. 2. Gurage language area, from Meyer (2014). The dark color represents Zway lake around the Gurage area. Oromo, Alaaba, Hadiyya and K’abeena are Cushitic languages spoken around the Gurage area. Yemsa is an Omotic language. Gura is spoken in Chaha area while Wolane is spoken in Silt’e area, and the geographical boundary of Gura and Wolane has not been identified yet.

3.1.1. Research assistants

Research assistants refer to selected secondary school teachers (minimum bachelor degree holders). They were selected from 10 schools in nine districts in the Gurage and Silt’e areas: eight districts in Gurage Zone and one district in Silt’e Zone (Chaha and Gura are spoken in Chaha district). From each school, three teachers who spoke the variety of that particular area as a native language were selected. In other words, a total of thirty teachers were recruited from the 10 schools. The teachers were selected using two screening steps. For the initial screening, a call for participation in the form of printed leaflets was distributed in the schools. The leaflets explained a few language requirements, such as being native speakers of the local variety and having lifelong residence in the language area. There were many schools in some of the districts. Except for Mesqan and Gura, a school in the administrative town of each district was selected. Regarding Mesqan, the administrative town is Butajira. Since the residents of Butajira are largely Amharic speakers and Mesqan is not frequently used, a school outside Butajira town was selected. Regarding Gura, which is spoken in Chaha district, speakers from around Gura Megenase (a suburban area of Endebir, a town in Chaha) were considered.

The leaflet included the contact information of the principal investigator so that any interested teacher who fulfilled the requirements could easily get in touch with the researcher. The call for participation was posted on the notice boards of all the secondary schools in the districts of interest. Among the teachers who responded to the call for participation, three were selected from each language area. This second screening was conducted using semi-structured interviews. The interviews focused on issues such as the teachers’ home language situation, the amount of exposure to the neighboring varieties, and language conditions in earlier workplaces (whether they regularly used their native language in the workplace). Based on these parameters, teachers who were native speakers of the local variety, and who used the language both in schools and at home, were recruited. The interviews took place in the schools of the respective teachers. The teachers received a small payment (300 birr) for their services.

3.1.2. Experimental subjects

The subjects were selected by the research assistants. Thirty students were recruited from each school, 300 in total. The students in all the grade levels in the secondary schools (from grade 9 to 12) were considered in order to include as many students as possible. Similar to the selection of the research assistants, the students were selected in a two-step screening process. First, all students who were native speakers of the local variety were requested to complete a registration form prepared for this purpose. The registration was completed by the research assistants. Once the native speakers of a local variety were identified, they were submitted to the second screening. Questionnaires were employed for the second screening. The questionnaires contained items about the students’ first and second language background, family language conditions, personal information and their contact with speakers of other neighboring language varieties. The questionnaires were prepared in Amharic because all secondary school students in the study areas were able to read and write Amharic. Indeed, Amharic is both the language of schooling and of workplaces in the study areas, except in Silt’e Zone where Silt’e is taught in elementary schools. The questionnaires were coded for each school and for each study area so that they could be easily identified during the analysis. All the items (questions) in the questionnaires were close-ended to maximize the accuracy of the responses and to take into account the age and education levels of the students.

Then, based on the information obtained through the questionnaires, 300 participants (30 from each variety), who were native speakers of the varieties of interest, were selected. Furthermore, based on the questionnaire data, it was ensured that the participants had lived their whole life in the area where their variety is spoken and that their parents were native speakers of the variety under investigation. Whenever more than 30 eligible students met the requirements for a variety, an equal proportion of sexes was used as an additional parameter: 15 male and 15 female students were randomly selected. Prior to data collection, permission was obtained from both Gurage and Silt’e Cultural and Tourism Bureaus, and from the administration of each school. Not all the selected participants attended the tests. Given that the Word Categorization and perception tests were administered at different times, in some of the language sites the number of participants who completed the Word Categorization test and the perception test was not exactly the same. In total, 285 participants completed the Word Categorization test. Among these, 171 were male and 114 were female. Moreover, 289 participants took part in the perception test, among which 171 were male and 118 were female.
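The sex-balanced random draw described in this paragraph can be sketched as follows. This is only an illustration: the function name, the record layout (dictionaries with a `sex` key), the fixed seed and the toy pool of candidates are assumptions, since the study does not report how the random selection was carried out.

```python
import random


def select_balanced(candidates, per_sex=15, seed=0):
    """Randomly draw `per_sex` male and `per_sex` female eligible students.

    `candidates` is a list of dicts with a "sex" key (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    selected = []
    for sex in ("male", "female"):
        pool = [c for c in candidates if c["sex"] == sex]
        # If fewer than `per_sex` candidates are available, keep all of them.
        selected.extend(rng.sample(pool, min(per_sex, len(pool))))
    return selected


# Hypothetical pool of 50 eligible students for one variety.
eligible = [{"id": i, "sex": "male" if i % 2 else "female"} for i in range(50)]
chosen = select_balanced(eligible)
```

With the toy pool above, `chosen` holds 30 records, 15 per sex.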

3.2. Determining structural distance

Structural distance was measured from two perspectives: lexical and phonetic. Words for the structural distance measure were randomly collected from different sources: from a list of words gathered for the Word Categorization test, from a fable entitled ‘The North Wind and the Sun’ (all the words in the story were included), as well as other published materials. In total, 240 words were compared to determine the two distances (see Appendix B.1). The principal investigator orally presented the list of words to the research assistants from each language area, and asked them to provide (orally) equivalents in their native language for each of the words. Then, the principal investigator phonetically transcribed the equivalents provided by the assistants. In case of disagreements during translation, the assistants were told to resolve them by majority rule (2/3).

3.2.1. Lexical distance

The lexical distances among the 10 selected language varieties were determined by computing, for each pair of varieties, the percentage of non-cognates among the total lexical items. Non-cognates are words that share meaning, but have different forms. The corpus of the lexical distance measurement consists of the words indicated in Section 3.2. The shared cognates were determined based on two parameters: similarity of roots and similarity of meaning between the corresponding pairs of words. These parameters were employed in a two-step process of cognate identification. First, the principal investigator identified pairs of words that share a common root based on the form (phonological) similarity between the corresponding words. In almost all Semitic languages, sequences of consonants form the basic word meaning (root). Hence, root similarity was considered as a core parameter, e.g., Amharic b?re ‘ox’, Endegagn bawra ‘ox’, Chaha bora ‘ox’. Then, the similarity in meaning among the pairs of words that share the same root was confirmed by the principal investigator and the research assistants, who are native speakers of the varieties. Once the cognate and non-cognate words in all pairs of varieties were identified, the percentage of non-cognate words was computed.
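As a rough illustration of the final computation step, the percentage of non-cognates for one pair of varieties could be obtained as sketched below. The cognate decisions themselves were made manually in the study; the representation used here (each meaning mapped to a cognate-class label), the function name and the toy data are assumptions for illustration only.

```python
def lexical_distance(classes_a, classes_b):
    """Percentage of non-cognate items between two varieties.

    Each argument maps a meaning to a cognate-class label (hypothetical
    encoding of the manual cognate judgements). Only meanings present in
    both varieties are compared.
    """
    shared_meanings = [m for m in classes_a if m in classes_b]
    if not shared_meanings:
        raise ValueError("no meanings in common")
    non_cognates = sum(
        1 for m in shared_meanings if classes_a[m] != classes_b[m]
    )
    return 100.0 * non_cognates / len(shared_meanings)


# Toy data: labels are invented, not taken from the study's word list.
variety_x = {"ox": "A", "cloud": "A", "sun": "A", "wind": "B"}
variety_y = {"ox": "A", "cloud": "A", "sun": "B", "wind": "B"}
distance_xy = lexical_distance(variety_x, variety_y)  # 1 of 4 items differs
```

Applying the function to every pair of the 10 varieties would yield a symmetric 10 × 10 lexical distance matrix.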

3.2.2. Phonetic distance

The output of the lexical distance measurement was used as input for the phonetic distance measurement; that is, phonetic distance was measured only between cognates that were phonetically transcribed. Cognates shared by at least six of the 10 language varieties were considered for phonetic distance. The cognates were aligned, and the distance among them was computed using the Levenshtein algorithm, based on the number of phones which were inserted, deleted or substituted. The distance computation used the simplest cost assignment, which assigns equal cost (1 unit) to all the operations. Following Kessler (1995, p. 5), only the distance among the cognates was computed, since the difference among non-cognates is not phonetic. The Levenshtein distance among the cognates was computed using Gabmap (see Heeringa, 2004; Nerbonne et al., 2011). Table 1 shows a sample Levenshtein (phonetic) distance between Kistane and Chaha based on the shared cognate ‘cloud’. In this case, the Levenshtein distance is 2: substitution of [m] by [b] and of [n] by [r]. These operations cost two units. This distance value is divided by the length of the longest alignment, 6 in this case, to obtain the normalized distance. The normalized distance between Kistane and Chaha in this particular example is 0.33 (2/6) (Table 1).
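The worked example can be reproduced with a small sketch of unit-cost Levenshtein distance normalized by alignment length. This is not Gabmap's implementation; the function names are hypothetical, and the vowel shown as `?` in Table 1 is written as the placeholder `V` because its exact transcription is not recoverable here.

```python
def levenshtein_with_alignment(source, target):
    """Return (distance, alignment_length) for two phone sequences.

    All operations cost 1 unit. Among alignments with minimal cost, the
    longest one is kept, so the distance can be normalized by its length.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # Each cell holds (cost, -alignment_length); tuple comparison prefers
    # lower cost first and, for equal cost, the longer alignment.
    table = [[(0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, -i)  # deletions only
    for j in range(1, cols):
        table[0][j] = (j, -j)  # insertions only
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if source[i - 1] == target[j - 1] else 1
            options = []
            for (cost, neg_len), step in (
                (table[i - 1][j - 1], substitution),  # match or substitute
                (table[i - 1][j], 1),                 # delete from source
                (table[i][j - 1], 1),                 # insert from target
            ):
                options.append((cost + step, neg_len - 1))
            table[i][j] = min(options)
    cost, neg_len = table[-1][-1]
    return cost, -neg_len


def normalized_distance(source, target):
    """Levenshtein distance divided by the length of the longest alignment."""
    cost, length = levenshtein_with_alignment(source, target)
    return cost / length if length else 0.0


# 'cloud' in Kistane and Chaha; "V" is a placeholder for the garbled vowel.
kistane = ["d", "a", "m", "V", "n", "a"]
chaha = ["d", "a", "b", "V", "r", "a"]
distance, length = levenshtein_with_alignment(kistane, chaha)
```

For this pair the sketch gives a distance of 2 over an alignment of length 6, matching the normalized value of 2/6 reported in the text.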

3.3. Functional and perceptual distances

This section presents tests designed to measure functional and perceptual distances among the 10 language varieties.

3.3.1. Functional distance

The Word Categorization test was adopted from Tang and van Heuven (2009). This test was selected since it could be administered with minimal priming or learning effect, the major factor that probably influenced previous studies by Gutt (1980), Ahland (2003) and Menuta (2015).

Table 1
Phonetic distance, using the Levenshtein algorithm. Kistane–Chaha ‘cloud’.

Kistane     d   a   m   ?   n   a
Chaha       d   a   b   ?   r   a
Cost                1       1
Absolute distance: 2

Materials: The material selection and preparation procedures were quite similar to those of Tang and van Heuven (2009). The first step in the material preparation was determining 10 semantic categories to be used for the test. The semantic categories were general concepts such as plants, fruits, animals, furniture, etc. (see Appendix B.2). Two parameters guided the selection. The first was the frequency of use of the semantic categories among the speakers of all varieties. For instance, some categories, such as musical instruments, are extremely culture-specific; as a result, they might not be common among all speakers. The second parameter was the possibility of a semantic category incorporating as many words as possible. This parameter was important because each semantic category must contain at least 10 words. First, the principal investigator selected the categories based on his intuition. The categories were later approved by the research assistants.

Similar parameters were used to determine the words to be included under each semantic category. In this case, word frequency was computed since frequency could be one of the factors that determine the comprehension of the words. It was not possible to compute directly the frequency of the lexical items to be categorized under each semantic category, because none of the Ethiosemitic varieties under investigation has its own structured corpus. Many of them also lack online oral and written documents which could be used as input to create a corpus. The only language in the area with a sufficient amount of easily available language data is Amharic. Hence, an Amharic language corpus containing about 100,000 written words was created using AntConc software (Anthony, 2004), and this corpus was used to estimate the frequency of each lexical item. All the sources of the data were written texts such as newspapers, magazines, academic articles and social media texts. Texts of different genres (politics, economics, agriculture, culture, sport, science, etc.) were included to make the corpus as representative as possible. Using this corpus, words that have a relatively high frequency were selected.

Using these procedures, 10 semantic categories, each containing the 10 most frequent words, were identified (see Appendix B.2). After the identification of the words and the semantic categories, the words under each semantic category were translated by the research assistants from Amharic to the 10 varieties. The translators were told to resolve any disagreement among them by majority vote (2/3). After the translation, each translator pronounced the translated words, 100 words for each variety, for sound recording with Adobe Audition running on a personal computer. Then, the three translators from each variety were asked to rate their three recordings of 100 words on a Likert scale that ranged from 0 (not natural) to 5 (natural). Finally, among the three recordings, the one with the highest rating score was selected for the intelligibility test.

Procedure: In the Word Categorization test, the participants’ recognition capability was tested through semantic multiple-choice categorization. In the test, the listeners indicated which of the 10 given semantic categories a spoken word belongs to. For instance, the respondents heard ‘banana’ and were asked to categorize this word under one of the 10 semantic categories (‘fruits’ in this case). The assumption here was that the correct categorization is achieved only if the listeners correctly recognize the target words. As there were 10 semantic categories for each word, the probability of categorizing a word correctly by chance is small (10%). In the process of developing this test, the primary activity was creating audio input so that the listeners do not hear the same word in the same variety more than once. In other words, the priming effect due to the repetition of similar input should be controlled for. Similar to Tang and van Heuven (2009), the Latin Square system was used for this purpose. Different data files (CDs) were created using the following procedures.

As indicated above, in the Word Categorization test, listeners must not hear the same word more than once. A word which is heard twice or more is more likely to be recognized than a word which is heard only once (the priming effect). In the present study, there were 10 semantic categories, each consisting of 10 lexical items, for a total of 100 (10 × 10) words. Based on these words, different CDs were created. On the first CD, the selected list of 100 words was presented in a fixed random order (1–100) in such a way that every subsequent word is spoken in a different variety. This is the default order. On the second CD (CD 2), the words were presented in the same order except that the presentation begins with the variety in which no. 100 was spoken, followed by the varieties in which no. 1 to no. 99 were spoken. Due to this shift, every word in CD 2 was spoken in a different variety as compared to CD 1. The third CD begins with the variety in which no. 99 was spoken, followed by the variety in which no. 100 was spoken, followed by the varieties in which no. 1 to no. 98 were spoken. Through this rotation, a total of 10 CDs were created, each CD containing 100 words in 10 semantic categories.
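The rotation described above can be sketched as a simple Latin Square assignment of varieties to word positions. The sketch assumes that on CD 1 the varieties cycle in a fixed order (variety 0 for word 1, variety 1 for word 2, and so on); the text only states that consecutive words are spoken in different varieties, so this cycling order, the zero-based variety indices and the function name are illustrative assumptions.

```python
N_VARIETIES = 10
N_WORDS = 100


def variety_for(cd, word):
    """Index (0-9) of the variety in which `word` (1-100) is spoken on `cd` (1-10).

    Each successive CD shifts the variety sequence by one position, so the
    variety of word k on CD c equals the variety of word k-1 on CD c-1.
    """
    return (word - cd) % N_VARIETIES


# Variety sequence for every CD, in presentation order of the 100 words.
cds = {
    cd: [variety_for(cd, word) for word in range(1, N_WORDS + 1)]
    for cd in range(1, N_VARIETIES + 1)
}

# Across the 10 CDs, each word is spoken once in each of the 10 varieties.
varieties_of_word_1 = sorted(cds[cd][0] for cd in cds)
```

Under these assumptions, CD 2 opens with the variety that CD 1 used for word no. 100, and CD 3 with the one used for word no. 99, as in the description above.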

One CD was administered to the participants from each language area (see Fig. 3). The 100 words on a CD were divided into 10 tracks, and each track was presented to a group consisting of three participants (every track was repeated three times) so that each member of the group classified the same 10 words into 10 semantic categories. Since there were 10 tracks on each CD, a total of 30 students listened to each of the CDs administered in each language area. These procedures meant that: (1) each listener experienced each word only once; (2) a listener from every language area heard each word in 10 different varieties; and (3) every member of a group heard one-tenth (1/10) of the total lexical items. Fig. 3 below shows the procedure of the task. Tang and van Heuven (2009) used 7 s as the response time. In the present study, the time was increased to 10 s in order not to put the students under time pressure. Before the actual testing, there was a practice session. For this session, a separate practice CD containing 10 words and 10 semantic categories from additional material was prepared. Each participant practiced at least once before beginning the actual task. More than one practice session was allowed depending on the confidence and interest of a participant.

For each track of the CDs, there was an answer sheet. Each answer sheet had its own CD and track numbers (e.g., CD 1, Track 2) so that each participant received an answer sheet with a different code number. Tang and van Heuven (2009) provided the list of 10 semantic categories on the response sheet. The same method was used in the present study. After listening to the orally presented words, the participants responded by choosing the appropriate match from the list of categories provided on the response sheet. The test was administered in quiet classrooms in the selected schools. Each participant was tested individually in a separate session. The test was administered by the principal investigator and one of the research assistants. The intelligibility measure was the percentage of words correctly matched with the semantic categories provided.

3.4. Perceptual distance and attitude tests

This section presents the procedures which were employed to determine perceptual distance and the attitudes of the speakers towards the test languages. Perceptual distance was measured from two perspectives: perceived similarity and perceived intelligibility. The presentation begins with the materials used for preparing the tests.

3.4.1. The materials

The fable ‘The North Wind and the Sun’ was used as input to determine the perceived intelligibility, the perceived similarity and the attitude of the speakers towards each other’s variety. First, the fable was translated from English to each of the local varieties by the three research assistants recruited from each language area. In case of disagreements, the assistants were told to resolve them using majority rule (2/3). A modified Amharic writing system was used for the translation. After the written translation, the translated version of each variety was orally presented by each of the three research assistants. The presentation of each translator was recorded using Adobe Audition running on a personal computer. Then, the three translators listened to each recording and rated the readings on a Likert scale that ranged from 1 (not natural) to 5 (natural). Finally, among the three readings, the one which received the highest rating score was selected for the test. The recordings were made in a silent room in each school. The recording process was administered by the principal investigator.


3.4.2. The tests and test procedures

The three types of tests (perceived intelligibility, perceived similarity and the attitude of the speakers) were combined and administered at the same time using the same material. Each test was represented by one item (question) with its own rating scale. This means that the combined test contained three questions: one for perceived similarity, another for perceived intelligibility and the remaining one for language attitude. The three test items were presented simultaneously to minimize the effect of the participants’ familiarity with the test material; that is, the test-takers answered the three questions after listening to each version of the recordings.

In order to minimize a response bias that might occur due to fatigue and familiarity with the test, the test items were arranged in three different orders; order A: (1) attitude test item, (2) perceived intelligibility test item, (3) perceived similarity test item; order B: (1) perceived intelligibility test item, (2) perceived similarity test item, (3) attitude test item; and order C: (1) perceived similarity test item, (2) attitude test item, (3) perceived intelligibility test item. Due to these arrangements, each test item appeared in three different positions. Before the test administration, the 30 speakers of each variety were randomly divided into three groups, each group containing 10 members. Then, the tests were administered in such a way that members of the same group received tests in the same order: the first group received order A, the second group order B and the third group order C. Administering tests in the same order to members of the same group was important in order to give the same instruction to all group members. The audio inputs were presented using a loudspeaker so that it would be possible for us to follow each response.

During the test, the test-takers listened to the recording of each variety and responded to the three successive questions. They responded by putting an ‘X’ on the Likert scale provided for each question. To measure the perceived intelligibility, the participants were asked to determine to what extent they understood the speaker in the recordings. After listening to each of the recordings, the test-takers indicated their judgment on Likert scales that ranged from 0 (‘do not understand at all’) to 10 (‘completely understand’). In the same manner, for perceived similarity, the respondents were asked to determine to what extent each of the presented recordings was similar to their own variety and to express their judgment using 11-point Likert scales that ranged from 0 (‘not similar’) to 10 (‘completely similar’). With regard to the language attitude, the respondents were instructed to determine whether the language in which the story was presented was beautiful or not, and to provide their responses on 10-point Likert scales that ranged from 1 (‘not beautiful’) to 10 (‘beautiful’). The recordings of the 10 language varieties were presented in different orders for the speakers of each variety to manage the impact of fatigue (respondents might be less conscientious on the last story). In other words, there were 10 different orders of the recordings, one order for the speakers of each language variety.

After the presentation of each recording, there was a response time of 3 minutes, 1 minute for each test item. For the sake of uniformity, the instruction was given in Amharic either by the principal investigator or by one of the research assistants. If there was a misunderstanding, further explanation was provided in the participants’ native language. The recordings were presented using a personal computer attached to a loudspeaker. After listening to each recording, the listeners provided their responses by marking ‘X’ on the scale provided. For each recording, there was a separate answer sheet. In other words, each test-taker received 10 pages of response sheets, one page for each recording. This procedure was vital to make sure that the test-takers precisely matched each recording with the respective test items. The 30 selected students took part in the perceptual tests after they had taken part in the intelligibility test.

3.5. Clustering and cluster validation

After data collection, Gabmap was employed for the clustering and cluster validation. Gabmap is dialect classification and visualization software developed by linguists at the University of Groningen (see Leinonen et al., 2016; Nerbonne et al., 2011; Snoek, 2014). It provides several statistical alternatives for grouping similar languages together. Based on Gooskens and Heeringa (2004, p. 196), the Weighted average method was employed to classify the Gurage varieties. However, clustering is often tricky: a small variation in the data matrix could result in different groupings. Gabmap provides three clustering validation techniques: discrete clustering, fuzzy clustering and multidimensional scaling. In the present study, multidimensional scaling was used to make sure that the clusters created were valid and reliable (see Nerbonne et al., 2011). The results of fuzzy clustering are presented in the appendix (see Appendix C.1). Multidimensional scaling takes a distance matrix as an input and groups values that are similar. Gabmap provides multidimensional scaling in a two-dimensional space. The first dimension explains much of the variance in the distance matrix. The second dimension explains a large portion of the remaining variance.

4. Results

Various distance matrices were obtained from the structural, functional and perceptual distance measures. In Section 4.1, we report the results of the classifications of the language varieties based on structural, functional and perceptual distances. As indicated in Section 3, structural distance was measured using the phonetic and lexical differences. Functional distance was determined based on the respondents’ scores on the Word Categorization test, and perceptual distance was estimated based on the respondents’ responses to the self-rating perception test. The average of the upper and the lower halves of the distance matrix was considered as the distance between languages in both the functional and perceptual measures. Section 4.2 presents the results of the relationship among the three dimensions of distance. Section 4.3 presents the results of the Word Categorization test.

4.1. Classifications of the South Ethiosemitic languages

In this section we present the classifications of the South Ethiosemitic languages based on the measures of the three dimensions of distance. The classification results are supplemented by the results of multidimensional scaling.

4.1.1. Classification of the languages based on structural distance

Fig. 4(a) shows the multidimensional scaling plot of phonetic distance in two-dimensional space. The first dimension is indicated by a solid arrow and the second dimension by a dashed arrow. In Fig. 4(a), the first dimension shows that Chaha, Gura, Gumer and Ezha have low values while Kistane and Silt’e have the highest values. The values of the other languages are between these two extremes. This dimension explains 52% (r = .72) of the variance in the distance matrix. The second dimension (dashed arrow) indicates that Endegagn has the lowest value while Mesqan and Muher have the highest values. The values of the other varieties are between these two extremes. This dimension explains 38% of the variance (r = .62). The two dimensions combined explain 90% of the variance in the matrix. Based on phonetic distance, the multidimensional scaling plot indicates six groups of language varieties: {Chaha, Gura, Gumer, Ezha}, {Muher, Mesqan}, {Endegagn}, {Inor}, {Silt’e} and {Kistane}. As can be seen from Fig. 4(a), Silt’e and Kistane are separate languages. Inor and Endegagn are also phonetically somewhat different. Fig. 4(b) shows the map of the first dimension of the multidimensional scaling for the phonetic distance. The light color shows the area with the highest linguistic distance, which is the Silt’e area.

The multidimensional scaling plot based on lexical distance is illustrated in Fig. 4(c). The first dimension (solid arrow) explains the majority of the variance, 96% (r = .98). As the figure illustrates, Gura, Gumer, Chaha and Ezha have the lowest values, and Silt’e has the highest value. The values of the other varieties are somewhere between these two extremes. The second dimension (dashed line) shows that Inor has the lowest value while Muher and Mesqan have the highest values. This dimension explains 2% (r = .15) of the variance in the data matrix. The two dimensions combined explain 98% of the variance. The multidimensional scaling plot of lexical distance shows five possible groupings of the language varieties: {Gumer, Gura, Ezha, Chaha} form a group. {Inor and Endegagn} also form a group. In the same manner, {Muher and Mesqan} form a group. However, {Kistane} and {Silt’e} are separate languages. Fig. 4(d) illustrates the map of the first dimension of multidimensional scaling for the lexical distance. The light color shows the area with the highest linguistic distance, which is again the Silt’e area.

The dendrograms obtained from the distances are presented in Fig. 5(a) and (c). The two dendrograms illustrate the classification of the language varieties based on the phonetic and lexical distances, respectively. Fig. 5(b) and (d) illustrate the dialect maps of the language varieties based on phonetic and lexical distances, respectively. As can be seen from Fig. 5(a), based on the phonetic parameter, {Gura, Gumer, Ezha, Chaha} form a group. {Muher and Mesqan} are closely related. However, {Kistane} and {Silt’e} are separate languages. Likewise, {Endegagn} and {Inor} are separate languages. Fig. 5(b) also shows the geographical distribution of the six dialect areas. In general, the phonetic distance measure shows that the South Ethiosemitic languages are classified into six dialect areas.

Fig. 5(c) presents the dendrogram of the language varieties based on lexical distance. As can be seen from the figure, from a lexical point of view, {Gura, Gumer, Chaha and Ezha} form a group. {Endegagn and Inor} also form a group. {Mesqan and Muher} form another group. {Kistane} and {Silt’e} are separate languages. Fig. 5(d) presents the dialect map of the language varieties, based on lexical distance. Unlike with phonetic distance, there are five distinct groups of languages. Clearly, the phonetic and lexical classifications are different. For example, Endegagn and Inor form a group in the lexical classification, but not in the phonetic classification. Kistane and Silt’e remain separate in both classifications.

4.1.2. Classifications based on functional distance

The functional distance results were obtained from the Word Categorization test. Since the Word Categorization test measures the similarity, not the difference, among the language varieties, the average of the participants’ scores on the test was subtracted from 100 to obtain functional distance. Fig. 6(a) presents a plot of multidimensional scaling of functional distance in two-dimensional space. The first dimension (solid arrow) shows that Silt’e has the highest value whereas Gumer, Chaha, Ezha and Gura have the lowest values. Muher and Mesqan have medium values. This dimension explains 79% (r = .89) of the variance in the distance matrix. The second dimension shows that Inor and Endegagn have the highest values, while Muher and Mesqan have the lowest values. This dimension explains 14% (r = .37) of the variance in the distance matrix. The two dimensions together explain 93% of the variance in the distance matrix. Fig. 6(b) illustrates the map of the first dimension of the multidimensional scaling. The lightest color indicates the area with the highest distance value, which is the Silt’e area. The pattern in the multidimensional scaling plot shows that there are roughly five groups of language varieties: {Gumer, Chaha, Ezha and Gura} form one group, {Muher and Mesqan} another group, and {Inor and Endegagn} also form another group. {Silt’e} and {Kistane} are separate languages. Fig. 6(c) and 6(d) present the dendrogram of the language varieties based on functional distance, and the corresponding dialect map. As can be seen from Fig. 6(c), {Gumer, Gura, Chaha and Ezha} form a group. {Muher and Mesqan} form another group. Moreover, {Endegagn, Inor} are closely related. {Silt’e} and {Kistane} are separate languages. Fig. 6(d) also shows five language areas, with Silt’e and Kistane each forming its own distinct dialect area.

4.1.3. Classifications based on perceptual distance

In Section 3 it was indicated that two perceptual distance measures, that is, perceived similarity and perceived intelligibility, were employed to determine perceptual distance among the language varieties. The percentage of the mean of the two measures was computed and subtracted from 100 to quantify perceptual distance among the varieties. It is important to remember that the perceptual test measures the similarity among the language varieties, not the difference, and this is why the subtraction was needed. The cluster analysis was performed on the average of the upper and the lower halves of the perceptual distance matrix.
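The conversion from similarity scores to a symmetric distance matrix, used for both the functional and the perceptual measures, can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `to_symmetric_distance` is hypothetical, the three rows are the Chaha, Ezha and Silt’e scores from Table 5 (read here as rows = listener groups, columns = test varieties), and setting the diagonal to zero is our assumption, since the text does not say how self-distances were treated.

```python
def to_symmetric_distance(scores):
    """Convert percentage similarity scores into a symmetric distance matrix.

    scores[i][j] is the mean percentage obtained by listeners of variety i
    on test variety j. Distance is 100 - score, and the two directions
    (i -> j and j -> i) are averaged, i.e. the mean of the upper and
    lower halves of the matrix.
    """
    n = len(scores)
    raw = [[100.0 - scores[i][j] for j in range(n)] for i in range(n)]
    sym = [[(raw[i][j] + raw[j][i]) / 2 for j in range(n)] for i in range(n)]
    for i in range(n):
        sym[i][i] = 0.0  # assumption: a variety has zero distance to itself
    return sym


# Chaha, Ezha and Silt'e scores from Table 5 (rows assumed to be listeners).
scores = [
    [81, 81, 42],
    [80, 80, 40],
    [43, 48, 87],
]
dist = to_symmetric_distance(scores)
```

For Chaha and Ezha, for example, the two directional distances are 19 and 20, so the symmetric distance entered into the cluster analysis would be 19.5.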

Fig. 7(a) shows the multidimensional scaling plot of perceptual distance. As the figure illustrates, in the first dimension, Ezha, Gumer, Gura and Chaha have the lowest values while Kistane and Silt’e have the highest values. This dimension explains 76% (r = .87) of the variance in the distance matrix. The second dimension (dashed arrow) shows that Inor has the highest value while Mesqan and Muher have the lowest values. This dimension explains 7% (r = .27) of the variance. The remaining values are between these two extremes. Both dimensions combined explain 83% of the variance in the distance matrix. The multidimensional scaling results clearly show that there are four groups of language varieties: {Chaha, Gura, Gumer and Ezha}, {Mesqan and Muher}, {Endegagn and Inor} and {Kistane and Silt’e}. From a perceptual point of view, Kistane is closely related to Silt’e. Fig. 7(b) shows the map of the first dimension of the multidimensional scaling. The light color shows the area that has the highest linguistic distance, which is the Silt’e area. Fig. 7(c) and (d) show the classification of the languages based on perceptual distance, and the dialect map of the 10 language varieties, respectively. Fig. 7(c) shows that {Chaha, Gumer, Gura and Ezha} form a group. {Inor and Endegagn} form a group. There is also a strong affinity between Muher and Mesqan. Unlike in the classifications based on structural and functional distances, Kistane and Silt’e form a group in the classification based on perceptual distance. Fig. 7(d) shows the dialect map of the South Ethiosemitic languages based on the perceptual measure.

4.1.4. The combined classification of Ethiosemitic languages

As presented in the preceding sections, the classifications that were obtained from the structural, functional and perceptual distance measures are not identical. The classification based on phonetic distance shows six groups of languages, while the classification based on lexical distance indicates five groups. Hence, this section aims to combine these classifications and provide a comprehensive classification of the languages. The results of the comparison between the combined classification and the classifications by the historical linguists will then be presented. Fig. 8(a)-(d) summarizes the classifications presented in Sections 4.1.1 to 4.1.3. Fig. 8(e) presents the combined classification, which was derived from the comparisons of all other classifications. The Sigma symbol in the combined classification represents an unspecified mother language.

Given that the linguistic distance was measured from three perspectives (structural, functional and perceptual), the distance matrices were ranked based on their reliability, and the most reliable distance measures were prioritized in the process of combining the classifications presented above. Gabmap provides two measures of reliability of distance matrices: local incoherence and Cronbach’s alpha. Local incoherence is a numerical score of local stress that is assigned to a set of differences between items (a measure of linguistic distances in the present study). The optimal score is zero, while any positive value is non-optimal. Comparing the value of local incoherence for different measurements over the same data indicates which result is more reliable (Nerbonne et al., 2011). Lower values of local incoherence mean that the results are better. The idea behind local incoherence is that, on average, locations that are close should be less different than locations that are further apart.

Cronbach’s alpha is a coefficient of reliability. It is used to measure the internal consistency or reliability of psychometric test scores. In Gabmap, it is used as the coefficient of reliability of the measurement of differences over the data. High (> .70) Cronbach’s alpha values mean that there is a high level of consistency among the measures of distance. Table 2 shows the results of local incoherence and Cronbach’s alpha for each of the distance matrices: phonetic, lexical, functional and perceptual.

Table 2 shows that phonetic distance has the highest Cronbach’s alpha value and the lowest value of local incoherence. This means that it is the most reliable measure compared to all other distance measures. Lexical distance has lower local incoherence and a higher Cronbach’s alpha compared to the functional and perceptual distance measures. Compared to perceptual distance, functional distance has a higher Cronbach’s alpha and a lower value of local incoherence. Perceptual distance has the lowest Cronbach’s alpha and the highest local incoherence, which means that it has very low reliability. In general, Table 2 shows that structural distance (both phonetic and lexical measures) presents the most reliable distance measures. Functional distance is more reliable than perceptual distance. Perceptual distance is the least reliable distance measure.

Given these differences in reliability, structural distance was employed as the primary parameter in the process of determining the combined classification; that is, if a set of language varieties formed a group in both the phonetic and lexical classifications, that set of languages was automatically considered for the combined classification. However, when languages belonged to different groups in the phonetic and in the lexical classification, functional distance was considered as a second parameter to determine which group is the most plausible one. Perceptual distance was considered as a third parameter when a set of language varieties formed different groups in the classifications based on both the structural and functional distances.

In Fig. 8(a)-(d), {Chaha, Gura, Ezha and Gumer} form a group not only in the classification based on phonetic distance, but also in the classification based on lexical distance. Therefore, this group was automatically included in the combined classification without even considering their classification based on functional and perceptual measures. {Inor} and {Endegagn} are separate languages in the classification based on phonetic distance, but they are very similar in the classification based on lexical distance. Therefore, functional distance was used as a second parameter. Based on these criteria, Inor and Endegagn were grouped together in the combined classification. {Mesqan and Muher} form a group in the classifications based on both phonetic and lexical measures. Hence, they automatically qualified for the combined classification. {Silt’e} and {Kistane} are separate languages in the classification based on the phonetic and lexical parameters. They are also separate languages in the classification based on functional distance. Therefore, they were considered as independent languages in the combined classification although they form a group in the classification based on perceptual distance. This was due to the fact that perceptual distance has very low reliability. Based on these requirements, the 10 South Ethiosemitic language varieties were classified into five groups: the first group consists of {Chaha, Gura, Gumer, Ezha}; the second group contains {Inor, Endegagn}; the third group comprises {Mesqan, Muher}; the fourth group includes only {Kistane}; and the fifth group consists of {Silt’e}.

Fig. 8. Comparisons of the classifications of South Ethiosemitic languages.

Table 2
Consistency within the distance matrices.

Measure                  Local incoherence   Cronbach’s alpha^a
Structural (Phonetic)    .22                 .97
Structural (Lexical)     .23                 .87
Functional               .29                 .63
Perceptual               .32                 .61

^a The high Cronbach’s alpha of the phonetic distance could be due to the high sample size. Nonetheless, the higher degree of Cronbach’s alpha of the remaining two measures (lexical and functional) clearly shows that perceptual distance has low reliability. It is also important to remember that the reliability measures for the functional and perceptual distances are based on the mean of the upper and the lower halves of the respective distance matrix.

As can be seen from Fig. 8(a)-(c), the grouping of the four Central West Gurage languages (Chaha, Gura, Gumer and Ezha) is consistent across all the classification parameters. Therefore, the four Central West Gurage languages were used as a point of reference to determine the relative positions of other groups of languages in the combined classification. {Muher and Mesqan} are closer to {Chaha, Gura, Gumer and Ezha} than {Kistane} is in the classification based on lexical distances. This is not the case for the classification based on phonetic distance, since {Kistane} is rather close to {Chaha, Gura, Gumer and Ezha}. In this case, functional distance cannot be used as a second parameter since Muher and Mesqan do not form a group in the classification based on functional distance. Hence, perceptual distance was used as a third parameter to move {Muher and Mesqan} closer to the four Central West Gurage languages. {Inor, Endegagn} are closer to the Central West Gurage languages than {Kistane} is in the lexical, functional and perceptual classifications; therefore, they maintained their position in the combined classification. Moreover, compared to Silt’e, {Kistane} is closer to the Central West Gurage languages based on the phonetic, lexical, functional and perceptual parameters. Silt’e is most remote from the Central West Gurage languages based on three (lexical, functional and perceptual) of the four classification parameters. The ultimate result of this process is the combined classification presented in Fig. 8(e).

The remaining point is determining to what extent the combined classification corresponds to the classifications previously proposed by historical linguists. Fig. 9(a)-(c) shows that the combined classification seems similar to the classification by Hetzron (1972). For example, in both classifications, Chaha, Gura, Gumer and Ezha form a group. Inor and Endegagn also form a group in both classifications. However, unlike in the combined classification, Muher and Mesqan do not form a group in the classification by Hetzron (1972). Moreover, unlike in the classification by Demeke (2001), Muher and Inor do not form a group with the Central West Gurage languages, which are {Chaha, Gura, Gumer and Ezha}, in the combined classification.

Mere impressionistic comparisons of the dendrograms may not precisely convey to what extent these classifications are similar. Hence, it is important for our purposes that the clustering procedures result in a re-estimation of the distances between collection sites, the so-called cophenetic distance (Nerbonne, 2010, p. 483). The cophenetic distance is the distance between two sites at the point at which they are fused in the clustering process. Cophenetic distances distort the original distance matrix because of the stipulation that the distance between the newly fused nodes, and all others, be the average of the distances from each component of the fusion to the others. For example, in Fig. 9(c), the cophenetic distance between Muher and Mesqan is two: (1) from Muher one node up to the mother node, (2) from the mother node down to Mesqan (see also Gooskens and Heuven, 2018). Pearson’s correlation coefficient was used to illustrate the relationship between the cophenetic distance of the combined classification presented in Fig. 9(e) and that of the classifications by the historical linguists.

For the sake of simplicity and space, only the 10 language varieties under investigation are included in Fig. 9 among several Ethiosemitic languages previously classified by the historical linguists. Since the distance between the nodes in a family tree is symmetrical (the distance between node A and node B is equal to the distance between node B and node A), the number of pairs of cophenetic distance measures is always N(N − 1)/2. In the present study, there are 10 language varieties. Therefore, the number of symmetric pairs of languages for which the cophenetic distance has to be computed is 10(10 − 1)/2, which is 45. For the sake of space, only the correlation coefficients between the cophenetic distance of the combined classification and that of the classifications by Demeke (2001) and Hetzron (1972) are presented here. The analyses of the relationship using Pearson’s correlation show that the cophenetic distance of the combined classification correlates more strongly with the cophenetic distance of the classification by Hetzron (1972), r = .761, than with that of the classification by Demeke (2001), r = .553. The two correlation coefficients are statistically significantly different, Hotelling’s t-test, t = 6.845, p < .001.
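The node-counting distance and the pair count used above can be illustrated with a short sketch. It assumes a tree stored as a child-to-parent mapping; the toy tree with leaves A, B and C is hypothetical and is not one of the trees in Fig. 9, and the helper names are ours.

```python
from itertools import combinations


def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(a, b, parent):
    """Count the edges between leaves a and b via their lowest shared node."""
    steps_up_from_b = {n: i for i, n in enumerate(path_to_root(b, parent))}
    for steps_up_from_a, n in enumerate(path_to_root(a, parent)):
        if n in steps_up_from_b:
            return steps_up_from_a + steps_up_from_b[n]
    raise ValueError("nodes do not share a root")


# Hypothetical toy tree (child -> parent): A and B are sisters, C joins at the root.
toy_parent = {"A": "AB", "B": "AB", "AB": "root", "C": "root"}
pairs = list(combinations(["A", "B", "C"], 2))
distances = {pair: tree_distance(pair[0], pair[1], toy_parent) for pair in pairs}

# Number of unordered pairs for the 10 varieties: N(N - 1)/2.
n_varieties = 10
n_pairs = n_varieties * (n_varieties - 1) // 2
```

Two sister leaves such as A and B are two steps apart (one up, one down), mirroring the Muher and Mesqan example; the 45 pairwise values from each of two trees are what a Pearson correlation would then compare.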

4.2. Relations among the three dimensions of distance

As indicated in Section 1.1, examining the relationship among the three dimensions of linguistic distance is one of the aims of the present study. In this section, therefore, correlations among the three dimensions of linguistic distance reported in the preceding sections are presented. Table 3 illustrates the correlation coefficients of the two structural distances, functional distance and perceptual distance. As can be seen from the table, there is a very strong correlation between the two structural distances, phonetic distance and lexical distance. Furthermore, the correlation between the two structural distances and perceptual distance is very strong. Compared to the other correlation coefficients, the correlation between functional distance and perceptual distance is smaller (although not to a statistically significant degree). This suggests that the participants’ similarity judgment and their actual score on the intelligibility test may not be exactly the same. In general, there are strong correlations among almost all the distance measures compared in Table 3. As a result, in Table 4, these correlation coefficients are compared to each other to determine whether there are statistically significant differences among them.

Fisher’s r to z transformation was employed to compare the correlation coefficients among the three distance measures: structural, functional and perceptual. Table 4 illustrates that there are no statistically significant differences among the correlation coefficients of all the distance measures.
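A comparison of two correlation coefficients via Fisher’s r to z transformation can be sketched as follows. The sketch assumes the independent-samples form of the test and n = 45 language pairs behind each coefficient; neither detail is stated in the text, so both are our assumptions, and the function name is hypothetical. The example compares the perceptual-phonetic (.853) and perceptual-lexical (.777) correlations from Table 3.

```python
import math


def compare_correlations(r1, r2, n1, n2):
    """Compare two correlations using Fisher's r-to-z transformation.

    Returns the z statistic and a two-tailed p-value, treating the two
    coefficients as if they came from independent samples (an assumption).
    """
    z1 = math.atanh(r1)  # Fisher transform: 0.5 * ln((1 + r) / (1 - r))
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed


z, p = compare_correlations(0.853, 0.777, 45, 45)
```

Under these assumptions the statistic comes out at roughly z = 1.05 with p of about .29, close to the first row of Table 4, which suggests (but does not confirm) that a similar computation was used.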


4.3. Intelligibility among the South Ethiosemitic languages

As indicated in Section 1, both functional distance and the degree of intelligibility to be discussed in this section refer to the respondents’ scores on the Word Categorization test. In other words, the respondents’ scores on the Word Categorization test were used as a tool to indicate the degree of functional distance among the 10 South Ethiosemitic language varieties as well as to determine the degree of intelligibility among the language varieties. In this section, the respondents’ scores on the Word Categorization test are presented. In Section 1, intelligibility was defined as the degree of communication or understanding between the speakers of related languages, in principle, without having had direct exposure to either of the languages. The assumption in the present study was that the correct categorization of the words into their semantic categories measures the degree of understanding (at least at the lexical level) of the speakers of the language varieties.

To determine the degree of intelligibility among the language varieties, a 75% intelligibility threshold was set based on the suggestion of Grimes (1995, p. 22) and partly based on the conservative nature of the test administered. Hence, a score of 75% or more on the Word Categorization test was considered as confirmation of intelligibility between the test language and the language of the test-takers. A score of 71% to 74% was considered as partial intelligibility. Anything below 71% was considered as absence of intelligibility. Table 5 shows the intelligibility scores of the participants on the Word Categorization test.
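The three-way decision rule can be written down directly. The following is a minimal sketch; the function name and the label strings are hypothetical, and scores are assumed to be whole percentages as reported in Table 5.

```python
def classify_intelligibility(score):
    """Map a Word Categorization score (in percent) to an intelligibility label."""
    if score >= 75:
        return "intelligible"
    if score >= 71:
        return "partially intelligible"
    return "not intelligible"


# Chaha listeners' scores on Ezha, Inor and Silt'e from Table 5.
labels = [classify_intelligibility(s) for s in (81, 69, 42)]
```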

As can be seen from Table 5, Chaha speakers understand Ezha, Gumer and Gura. Endegagn speakers partially understand Inor. Speakers of Ezha understand Chaha and Gumer. In the same manner, Gumer speakers understand Chaha, Gura, Ezha and Muher. Gura speakers understand Chaha, Ezha, Gumer and Muher. Inor speakers partially understand Chaha and fully understand Endegagn. Furthermore, Mesqan is partially intelligible to Ezha. Muher speakers understand Chaha. Silt’e and Kistane are not intelligible to speakers of any of the other language varieties.

Table 3
Correlation coefficients of the three dimensions of distance.

                         Structural   Structural   Functional^a   Perceptual
                         (Phonetic)   (Lexical)
Structural (Phonetic)                 .874         .804           .853
Structural (Lexical)                               .849           .777
Functional                                                        .747

Table 4
Comparison of the correlation coefficients.

Compared coefficients^a      z-value   p-value
rPcpD rPD vs. rPcpD rLD      1.051     .293
rFD rPD vs. rFD rLD          .654      .513

Table 5 further shows that the test-takers did not score 100% on their own native languages although, in principle, it is assumed that native speakers have a perfect knowledge of their own language. The participants underperformed on their native languages probably due to non-linguistic factors such as fatigue, the quality of the recordings, lack of attention, noise in the test environment and time pressure. In order to compensate for the influence of these factors, adjusted means were computed for the participants’ scores on the Word Categorization test. The adjustment was computed by subtracting the actual mean of the participants’ score on their own native language from the hypothetical mean, which is always 100%. The mean difference was then added to the same participants’ scores on the non-native languages, with the assumption that the factors that affect the participants’ scores on their native languages equally affect their scores on the non-native languages. For instance, Chaha speakers, on average, scored 81% on their own native language, although they are supposed to score 100%. Therefore, the adjustment was computed by subtracting 81% from 100%, which is 19%. Then 19% was added to the scores of the Chaha participants on all other language varieties. Table 6 presents the adjusted mean scores computed based on the results illustrated in Table 5.
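The adjustment can be sketched as below, using the Chaha row of Table 5 as input. The function name is hypothetical, and capping the adjusted scores at 100% is our inference from Table 6 (where 85 + 19 is reported as 100), not something the text states explicitly.

```python
def adjust_scores(row, native_index, ceiling=100):
    """Add the native-language shortfall to every score in a listener group's row.

    row holds the group's mean percentages on each test variety;
    native_index points at the group's own variety.
    """
    correction = ceiling - row[native_index]
    # Assumption inferred from Table 6: adjusted scores are capped at 100.
    return [min(ceiling, score + correction) for score in row]


# Chaha row of Table 5: CH, EN, EZ, GM, GU, IN, KS, MS, MU, SI.
chaha_raw = [81, 58, 81, 85, 81, 69, 50, 46, 69, 42]
chaha_adjusted = adjust_scores(chaha_raw, native_index=0)
```

With a correction of 19 points, the result reproduces the Chaha row reported in Table 6.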

Based on the adjusted means presented in Table 6, Chaha speakers can understand Endegagn, Ezha, Gumer, Gura, Inor and Muher. Endegagn speakers can freely communicate with Chaha, Inor and Muher speakers. Speakers of Ezha understand Chaha, Gumer, Gura and Muher. They also partially understand Endegagn and Inor. Gumer speakers understand Chaha, Ezha, Gura, Mesqan and Muher. They also partially understand Kistane. Gura speakers understand Chaha, Ezha, Gumer and Muher. They also partially understand Kistane and Mesqan. Inor speakers understand Chaha, Endegagn, Ezha, and Gumer. They also partially understand Gura and Muher. Moreover, Mesqan speakers understand Chaha, Ezha, Gumer, Kistane and Muher. Muher speakers understand Chaha, Ezha, Gumer and Gura. Silt’e is not intelligible to any of the language varieties.

Menuta (2015, p. 189) argues that the best center of communication is Mesqan, based on the study he conducted on six Gurage varieties: Chaha, Inor, Kistane, Mesqan, Muher and Wolane. In other words, according to Menuta (2015), many speakers of Gurage varieties understand Mesqan better than the remaining Gurage varieties investigated in the study. The present finding contradicts this report. As can be seen in Fig. 10, it is Chaha that seems to be the center of communication. Chaha is intelligible to seven of the 10 language varieties investigated. Silt’e was excluded from the figure since it was not intelligible to any of the language varieties. In Fig. 10, the two-directional arrow shows that intelligibility is symmetrical, while the one-directional arrow shows that intelligibility is asymmetrical.

The difference between these two findings might be due to various factors. First, the present study only used the Semantic Word Categorization test. The authors recognize that testing intelligibility at a higher linguistic level may yield

Table 6
The adjusted mean of the test-takers’ scores on the Word Categorization test.

Language^a    CH    EN    EZ    GM    GU    IN    KS    MS    MU    SI
Chaha        100    77   100   100   100    88    69    65    88    61
Endegagn      81   100    67    67    62    90    67    62    76    52
Ezha         100    72   100    87    96    72    56    60    96    60
Gumer         96    68    93   100    96    64    71    82    96    50
Gura          97    66    93    97   100    69    73    73    93    52
Inor          89   100    82    86    73   100    68    63    73    50
Kistane       65    65    56    74    65    56   100    69    52    39
Mesqan        82    57    86    82    57    57    82   100    78    48
Muher         96    57    88    88    84    65    84    61   100    42
Silt’e        56    56    61    70    56    35    48    48    61   100

a The test languages are abbreviated: CH = Chaha, EN = Endegagn, EZ = Ezha, GM = Gumer, GU = Gura, IN = Inor, MS = Mesqan, MU = Muher, SI = Silt’e and KS = Kistane; the results are converted to percentages.

Table 5
Mean of the participants’ scores on the Word Categorization test.

Language^a    CH    EN    EZ    GM    GU    IN    KS    MS    MU    SI
Chaha         81^b  58    81    85    81    69    50    46    69    42
Endegagn      62    81    48    48    43    71    48    43    57    33
Ezha          80    52    80    76    76    52    36    40    76    40
Gumer         82    54    79    86    82    50    57    68    82    36
Gura          83    52    79    83    86    55    59    59    79    38
Inor          71    91    64    68    55    82    50    45    55    32
Kistane       48    48    39    57    48    39    83    52    35    22
Mesqan        67    42    71    67    42    42    67    85    63    33
Muher         77    38    69    69    65    46    65    42    81    23
Silt’e        43    43    48    57    43    22    35    35    48    87

a The test languages are abbreviated: CH = Chaha, EN = Endegagn, EZ = Ezha, GM = Gumer, GU = Gura, IN = Inor, MS = Mesqan, MU = Muher, SI = Silt’e and KS = Kistane; the intelligibility results are converted to percentages.

b The participants did not fully understand their own variety. This could be because of various factors including recording quality, time pressure, and lack of attention.


different results. Nonetheless, the present study opted to include a relatively large number of languages and to examine them from different perspectives, rather than just focusing on intelligibility. In this regard, Menuta (2015) included several tests which are promising. Nonetheless, there are also concerns about the approaches taken by Menuta (2015). We suspect that the priming effect was not properly controlled for; the same test materials were repeated across the speakers of different varieties; therefore, it is possible that the intelligibility scores were inflated because of the participants’ familiarity with the test materials. Besides, Menuta (2015) tested elderly people while the participants of the present study were secondary school students. It could be that elderly people performed better on some of the non-native languages as compared with the young people mainly because of their lifelong exposure to the non-native language varieties. Sample size could also be another factor. Menuta (2015) tested 12 participants from each site. The present study tested 30 participants from each site. A carefully selected small sample could reflect exceptional performance of the participants because of their linguistic abilities. Moreover, during test administration Menuta (2015) asked the participants to provide written answers. It is not clear how the respondents managed to provide written answers since none of the Gurage varieties (except Silt’e) has a writing system.

5. Discussion

As presented in Section 4.2, the comparisons among the measures of the three dimensions of distance show that the two structural distances (phonetic and lexical) strongly correlate with each other. This implies that the two structural measures can be used interchangeably to determine the linguistic distance among related languages. The present study


also reported a very strong correlation between structural distance and functional distance, although different materials were used to measure the two dimensions of distance. This suggests a high degree of substitutability between the two dimensions of measuring linguistic distance. Moreover, the strong correlation between structural distance and functional distance indicates that the respondents’ scores on the intelligibility test have a strong connection with the properties of the structure of the language varieties.
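To make concrete what a correlation between two distance measures involves, the following minimal Python sketch correlates the pairwise distances of two symmetric distance matrices. It is an illustration only: the exact procedure used in the study is not described in this section, the function names are hypothetical, and the two small matrices are invented toy values for three unnamed varieties, not data from the study.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Collect each unordered pair of varieties once (cells above the diagonal)."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical toy distance matrices for three varieties (not study data).
phonetic = [[0.0, 0.2, 0.6],
            [0.2, 0.0, 0.5],
            [0.6, 0.5, 0.0]]
lexical = [[0.0, 0.1, 0.7],
           [0.1, 0.0, 0.6],
           [0.7, 0.6, 0.0]]

r = pearson_r(upper_triangle(phonetic), upper_triangle(lexical))
print(round(r, 3))
```

Because the cells of a distance matrix are not independent observations, a significance test for such a coefficient would typically require a permutation-based procedure, which this sketch does not attempt.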

Given that there is no significant difference between the correlation coefficient of phonetic distance and intelligibility scores, and that of lexical distance and intelligibility scores, it seems that there is no difference between the two structural distances in terms of their influence on the participants’ scores on the Word Categorization test. This finding is slightly different from previous studies, which reported a stronger correlation between lexical distance and functional distance as compared to the correlation between phonetic distance and functional distance (e.g., Tang and van Heuven, 2009). It also differs from studies that reported a stronger correlation between phonetic distance and functional distance, but not between lexical distance and functional distance (e.g., Van Bezooijen and Gooskens, 2007). Many factors, such as similarity of phoneme inventory and word frequency, may contribute to the relationship between functional distance and structural distance. The relationship between these two dimensions is probably language specific. For instance, in some languages, lexical similarity might be more important than phonetic similarity, while in other languages a slight phonetic difference may lead to misunderstanding. Moreover, the strong correlation between structural distance and perceptual distance shows that perceptual distance can be used as an alternative means of determining the linguistic distance among related languages, especially in a situation where gathering real linguistic data is difficult. Similar results were previously reported by Gooskens and Heeringa (2004) and by Tang and van Heuven (2009). This is good news for less studied languages that do not have dictionaries or detailed descriptions of their linguistic features. However, the low level of consistency in the perceptual distance matrix hints that there is a risk in using perceptual distance alone to measure linguistic distance among related languages. This is because the perceptual perspective of measuring linguistic distance is more subjective than other means of measuring linguistic distance. As noticed by Golubović and Sokolić (2013), Abu-Rabia (1996), Abu-Rabia (1998) and Pavlenko (2006), the impact of language attitude is also more pronounced in situations where there are political divisions, stereotyping, and social and cultural hostilities.

Furthermore, the close similarity between the classifications based on the three dimensions of distance and the genealogical classifications previously provided by the historical linguists implies that, in addition to structural distances, functional and perceptual distances can also be used to classify related languages. In the present study, we noticed very close similarity between typological classifications and genealogical classifications. This result is consistent with a previous report by Tang and van Heuven (2009). In general, the correlations among the three dimensions of distance, which are reported in the present study, are consistent with studies previously conducted on Scandinavian languages (e.g., Gooskens, 2005; Gooskens, 2007; Gooskens and Heuven, 2018; Gooskens and Heeringa, 2004) and on Chinese dialects (Tang and van Heuven, 2009). These studies, in general, indicate that the distance among related languages can be measured from different perspectives. It is up to a researcher to choose the right perspective based on various factors, such as the resources at their disposal and the desired study objectives; for example, whether the aim of the study is typological or genealogical classification. Our study partly supports the claim that non-linguists’ level of awareness can be used as a valid means of measuring distances among related languages, but we are also cognizant of the enduring debate regarding the validity of the perception-based approach (see Goeman, 1999) because of the low reliability of the perceptual distances we observed.

The classifications of Ethiosemitic languages based on the results obtained from the structural, functional and perceptual distance measures show that Chaha, Ezha, Gumer and Gura are very closely related languages. Mesqan and Muher also have very strong lexical affinity with these four languages. The lexical affinity among these language varieties was also reported in Menuta (2015). Mesqan and Muher also have close phonetic and lexical similarities. Kistane and Silt’e are different from all the remaining language varieties. This difference is probably due to the influence of the Cushitic languages on Silt’e and Kistane. This is an intuitive suggestion, and the interaction between South Ethiosemitic languages and the surrounding Cushitic languages is an issue that future studies may address.

The comparisons of the classifications obtained from the three distance measures show that the South Ethiosemitic languages under investigation can be classified into five groups. {Chaha, Gura, Gumer and Ezha} form a group. {Muher and Mesqan} are very similar languages; hence, they form a second group. {Inor and Endegagn} consistently form the third group. {Kistane} and {Silt’e} are different from all other language varieties. These classifications are very similar to classifications previously proposed by Hetzron (1972), but differ somewhat from others, for example, Demeke (2001). Demeke (2001) classified Mesqan under North Gurage, together with Kistane. Although both the structural and functional measures show that Kistane and Silt’e are different languages, the speakers of the language varieties believe that their languages are similar to each other. The causes of the mismatch between the speakers’ perceptions and the linguistic reality merit further investigation.

With regard to the intelligibility among South Ethiosemitic languages, the results obtained from the Word Categorization test show that Chaha, Gura, Gumer and Ezha are mutually intelligible. Muher and Mesqan are partially