
Data Analysis

In document Cover Page The handle (pagina 108-113)

Chapter III RESEARCH METHODOLOGY

3.3 Data Collection and Analysis

3.3.3 Data Analysis

In order to present the information collected by means of qualitative interviews and quantitative questionnaires in the form of an overall statistical profile of patterns of transcultural health care utilisation behaviour, a thorough data analysis has been performed. Qualitative data have been studied on the basis of text interpretation in order to understand the meanings of, and the interconnections among, the cultural expressions used by the community members (cf. Bernard 2002). Data collected on the indigenous classifications of MAC plants and of illnesses, for example, have been analysed schematically and graphically, thereby showing how people organise their knowledge. Audio-recordings have been transcribed and translated, pamphlets have been translated, and samples of plants have been dried and pressed between books and papers.

The extensive body of comparable quantitative data, collected by means of the formal questionnaire in correspondence with the conceptual model, has been evaluated through methods of statistical data analysis. Following the preparation of the data set - a process specifically explained in Paragraph 8.1.1 - a number of statistical techniques have been applied to the collected data in order to closely analyse patterns of transcultural health care utilisation behaviour and to substantiate the qualitative findings. In general, statistical analysis offers a variety of procedures which can be used to identify the effects, processes and interactions involved in the patterns of health care utilisation behaviour reported by the study population, and ultimately to predict people’s utilisation behaviour on the basis of the multivariate model applied to the collected data. The statistical methods have been chosen in accordance with the type of variables included in the respective categories of factors, represented as blocks in the model, with the objective of evaluating the relationships and interactions among the variables as well as between the blocks of variables. In other words, the techniques of statistical analysis have been selected with a view to documenting, understanding and explaining the interaction between the independent and the intervening variables in relation to the dependent variables of health care utilisation in the multivariate model, and ultimately to providing a statistically sound indication of the variables which act as significant determinants of people’s health care utilisation behaviour.

In an effort to divide the sample population into household members and then to sub-divide these into ‘patients’ and ‘non-patients’, the quantitative data collected during the household surveys have first been transformed into two different data sets: (1) all household members, including ‘patients’ and ‘non-patients’ (N=656); and (2) health care utilisation rates of all household members identified as ‘patients’ (N=452). In this respect, the number of households visited during the surveys amounts to 293, as the original target of a total of 300 units had to be reduced after the initial process of data cleaning. The two data sets offer information on all blocks of variables identified within the multivariate model for each household member of the sample. Data set one is further described in Chapters V, VI and VII, where it is subjected to univariate analysis, namely a descriptive analysis of one variable at a time, and is used to substantiate the qualitative findings. Data set two, which is described in Chapter VIII, provides the basis for the stepwise analyses, i.e. the bivariate analysis, the mutual relations analysis, the multivariate analysis, and the multiple regression analysis performed on the collected data. All analyses applied to the data sets are carried out with the use of Version 20 of the IBM Statistical Package for the Social Sciences (SPSS).

The bivariate analysis forms the point of departure for all statistical methods applied to the collected data, in that it analyses the strength, direction and shape of the relationship between two variables. In assessing the quality of the relationship between a pair of variables, the bivariate analysis shows to what extent the relationship between the two variables exists by chance alone, or indicates a statistically relevant covariation. In other words, the bivariate analysis indicates if and how changes in one variable are met with similar changes in the other variable, and whether these changes occur because of a statistically significant relationship between the variables (cf. Bernard 2002; Field 2009; Leurs 2010). In the context of the present study, the bivariate analysis shows to what extent the score of the dependent variables can be predicted from any of the independent or intervening variables in the model. The bivariate analysis, however, does not allow for any indication of causality, i.e. for explaining which variable actually results in change, not least because any relationship between two variables can - or will - always be affected by other measured or unmeasured variables (cf. Field 2009).

Based on the type of data collected from the quantitative household surveys, the method of the bivariate analysis is selected in order to analyse the relationship between categorical variables with more than two categories, and to perform an analysis of the related frequencies.

Firstly, the technique of cross-tabulation is implemented, in which two variables are arranged in a grid and the frequencies of each combination of categories are noted in the cells of the tabulation. In this way, it becomes possible to make a detailed comparison between both variables. While the schematic presentation of data in the table allows for an initial interpretation of frequencies and percentages across the cells, the Pearson’s chi-square test (χ²) is used to evaluate the actual relationship between the two variables. The Pearson’s chi-square test of independence of two categorical variables determines whether the two variables within a cross-tabulation are associated, and whether such an association is likely to have occurred by chance or indicates an actual phenomenon within the population at large. In other words, the smaller the significance value of the Pearson chi-square statistic, which is marked with χ², the less likely it is that the association between the two variables exists by chance (cf. Bernard 2002; Field 2009; Leurs 2010).
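
The computation described above can be illustrated with a minimal sketch in Python. It is not the SPSS procedure used in the study: the table values are hypothetical, the function names (`expected_counts`, `chi2_survival`, `pearson_chi2`) are chosen for illustration only, and the sketch assumes that no row or column total is zero. The significance value is obtained from the upper tail of the chi-square distribution by means of a standard recurrence for the incomplete gamma function.

```python
import math


def expected_counts(table):
    """Expected cell frequencies under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi2_survival(x, df):
    """Upper-tail probability of the chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)             # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))  # closed form for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1),
    # applied until the requested degrees of freedom are reached.
    while 2 * a < df:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return min(q, 1.0)


def pearson_chi2(table):
    """Return (chi-square statistic, degrees of freedom, significance value p)."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_survival(stat, df)


# Hypothetical 2 x 2 cross-tabulation (rows and columns are two categorical variables).
hypothetical_table = [[10, 20], [20, 10]]
stat, df, p = pearson_chi2(hypothetical_table)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
```

In this sketch a larger chi-square statistic yields a smaller p-value, which is the quantity compared against the significance criterion.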

The Pearson’s chi-square test is based on two assumptions: (1) each respondent contributes to only one category in the cross-tabulation; and (2) no more than 20% of the expected frequencies are below 5 and none of the expected frequencies is below 1 (cf. Field 2009; Leurs 2010). With regard to the first assumption, all data included in the cross-tabulations are independent, as each respondent contributes to only one cell within the grid. With regard to the second assumption, in the majority of cross-tabulations the expected frequencies, albeit not presented, are all above 1, and no more than 20% of them fall below 5.
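
The second assumption can be checked mechanically. The following Python sketch, with hypothetical tables and an illustrative function name (`check_expected_frequencies`), computes the expected frequencies of a cross-tabulation and reports whether the rule of thumb stated above is met; it is an illustration only and not the routine used in the study.

```python
def expected_counts(table):
    """Expected cell frequencies under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_expected_frequencies(table):
    """Apply the rule: no expected frequency below 1, at most 20% below 5."""
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return {
        "min_expected": min(cells),
        "share_below_5": share_below_5,
        "assumption_met": min(cells) >= 1 and share_below_5 <= 0.20,
    }


# Hypothetical examples: a balanced table and a small, unbalanced one.
print(check_expected_frequencies([[10, 20], [20, 10]]))
print(check_expected_frequencies([[1, 2], [3, 40]]))
```

A table for which `assumption_met` is false would be a candidate for an exact test rather than the approximate chi-square test.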

Nevertheless, in a few cases the variables fail to meet the second assumption, as the data are heavily tied or unbalanced, causing the sampling distribution of the test statistic to deviate too far from the approximate chi-square distribution. Consequently, the probability that the chi-square test fails to detect a statistically significant relation between the variables increases considerably. In order to overcome this loss of statistical power, an exact test is applied to those cross-tabulations that fail to fulfil the second assumption, thereby solving the problem of approximation in small, unbalanced tables (cf. Field 2009).
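
The idea of an exact test can be sketched for the simplest case, a 2 x 2 table, using Fisher’s exact test. This is a simplified illustration under stated assumptions: the study does not specify which exact procedure was applied, the SPSS exact tests also cover larger tables, and the function name `fisher_exact_2x2` and the cell values are hypothetical. The two-sided p-value is taken here as the sum of the probabilities of all tables with the same margins that are no more probable than the observed table.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that ties are not lost to rounding.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(total, 1.0)


# Hypothetical small table with low cell counts.
print(fisher_exact_2x2(3, 1, 1, 3))
```

Because the p-value is computed from the exact distribution of the table, it does not rely on the chi-square approximation that breaks down when expected frequencies are small.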

Apart from adding the option of exact tests to the present data set, the interpretation of the significance values of the chi-square test results is refined to a certain extent. In general, significance values assess the probability with which a relationship between variables occurs by chance, and are usually displayed as probability (p) values. In modern statistics, a probability value of .05, which corresponds to a confidence level of 95% and implies a .05 probability that the relationship between variables occurs by chance, commonly forms the criterion of significance tests (cf. Field 2009).

Nevertheless, a number of studies have refined the mere dichotomy of defining values above .05 as ‘non-significant’ and values below .05 as ‘significant’ into a more nuanced classification of significance. In general, p-values above the threshold of .05 may indicate a certain trend among the data, while values below the criterion may differ in their degree of significance. In order to avoid an oversimplification of results, the significance values (p) of the Pearson’s chi-square test are arranged along the following scale: p > .15 ‘non-significant’; p = .15-.10 ‘indication of significance’; p = .10-.05 ‘weakly significant’; p = .05-.01 ‘strongly significant’; p = .01-.001 ‘very strongly significant’; p < .001 ‘most strongly significant’ (cf. Agung 2005; Leurs 2010; Djen Amar 2010; Ambaretnani 2012; Chirangi 2013).
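
The scale above amounts to a simple lookup on the significance value, which can be sketched in Python as follows. The function name `significance_label` is illustrative, and the treatment of values that fall exactly on a boundary (each boundary value is assigned to the weaker of the two adjacent classes, with .15 itself counted as ‘indication of significance’) is an assumption, since the scale as stated leaves the boundaries open.

```python
def significance_label(p):
    """Map a significance value p onto the six-level scale used in the text."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < 0.001:
        return "most strongly significant"
    if p < 0.01:
        return "very strongly significant"
    if p < 0.05:
        return "strongly significant"
    if p < 0.10:
        return "weakly significant"
    if p <= 0.15:  # assumption: .15 itself still counts as an indication
        return "indication of significance"
    return "non-significant"


# Hypothetical significance values, one per class.
for value in (0.20, 0.12, 0.07, 0.03, 0.005, 0.0005):
    print(value, significance_label(value))
```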

In order to assess the strength of each statistically significant association between two variables, Cramer’s V can be applied to the data, as it measures the strength of significant relationships regardless of the number of categories of each variable in the cross-tabulation. The values of Cramer’s V range from 0 to 1, whereby 0 implies that no relationship exists between the variables and 1 indicates that the variables are perfectly associated (cf. Field 2009; Leurs 2010). In an attempt to adequately explain and interpret the effects between variables identified through a bivariate analysis, conclusions can be drawn from the percentage values presented within each cross-tabulation. Since significant associations can derive from proportionally small differences in cell frequencies, percentages rather than frequencies form an appropriate means to detect patterns within the data (cf. Field 2009). The bivariate tables indicating the distribution of two selected variables are therefore interpreted on the basis of percentage values.
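
Cramer’s V is commonly computed from the chi-square statistic as the square root of χ² divided by N times (k - 1), where N is the number of observations and k is the smaller of the number of rows and columns. The following Python sketch illustrates this standard formula on hypothetical tables; the function names are illustrative and the sketch assumes that no row or column total is zero.

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic of a cross-tabulation given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramer's V = sqrt(chi2 / (N * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2_statistic(table) / (n * (k - 1)))


# Hypothetical tables: a moderate association, a perfect one, and none at all.
print(cramers_v([[10, 20], [20, 10]]))
print(cramers_v([[30, 0], [0, 30]]))
print(cramers_v([[10, 10], [10, 10]]))
```

Dividing by N and by the smaller table dimension minus one is what keeps the measure between 0 and 1 for tables of any size.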

Subsequently, the mutual relations analysis is performed, in which all statistically significant associations between variables are indicated within the blocks of variables in the multivariate model of transcultural health care utilisation. Together, the procedures and results of the bivariate and mutual relations analyses, which form the main component of the statistical data analysis applied in this study, provide a sound basis for the further consultation and interpretation of the results obtained from the subsequent multivariate data analyses.

Although a bivariate analysis offers valuable insight into the relationship between independent, intervening and dependent variables, it fails to take into account the mutual and simultaneous interactions between all variables included in the multivariate model. Multivariate statistics form an extension of univariate and bivariate methods of data analysis by offering techniques to analyse more complicated data sets, in which variables are correlated to varying degrees. The multivariate statistic applied in the present study is the nonlinear canonical correlation analysis, using the technique of OVERALS, which has been developed at the Department of Data Theory of the Faculty of Social Sciences of Leiden University (cf. Gifi 1990).

In particular, classical canonical correlation analysis assesses the relationship between two sets of variables, frequently referred to as independent and dependent variables, in which both sets contain multiple variables. As Gifi (1990: 193) elaborates: ‘Any variable may contribute to the analysis in as much as it provides independent information with respect to the other variables within its own set and to the extent that it is linearly dependent upon other variables in the other set’. In order to study the relationship between sets of variables, each set is first combined into a predicted value, namely an underlying dimension within the set also referred to as underlying factors or linear components, in which variables cluster together in a meaningful way. Clearly, the number of variables within each set produces a specific number of possible combinations, namely dimensions, whereupon the researcher determines the number of dimensions necessary for analysis.

Multivariate analysis generally produces reliable results on the first two dimensions, whereby the strongest combinations are found on the first dimension. In this way, canonical correlation analysis links up with the techniques of factor analysis and principal component analysis in the sense that it produces, across all subjects, the highest correlations between the predicted value and each selected variable, thereby estimating the extent to which the variable contributes to the underlying set of variables. The coefficients which result from the correlations between the predicted values and the single variables are known as ‘factor loadings’ or ‘component loadings’. On the whole, this analysis aims at identifying items which load high on the respective factor or component (cf. Tabachnick & Fidell 2001; Bernard 2002; Field 2009).

In classical multivariate analysis, variables are usually assumed to have an a priori quantification and have to be treated at the interval level of measurement whereupon the transformation graph takes the shape of a straight line. The multivariate technique applied to this study, however, allows for the inclusion of variables, which have been measured at the nominal or ordinal level, whereby transformation graphs are produced in a shape other than straight lines (cf. Van de Geer 1993b). The OVERALS method describes a canonical correlation analysis, which is nonlinear and which assigns nonlinear transformations to the respective variables in order to be able to maximize correlation. In general, OVERALS forms an appropriate method of multivariate data analysis for the present study, particularly since a number of comparable studies have applied this method successfully. In these studies, research has also been conducted on patterns of behaviour by means of examining the relationship between variables, which are identified in an overall multivariate model (cf. Agung 2005; Ibui 2007; Djen Amar 2010; Leurs 2010; Ambaretnani 2012; Chirangi 2013).

Apart from serving as the main method of multivariate analysis, the OVERALS technique is also implemented during the last stage of the statistical analysis performed on the collected data, during which an explanatory model of health care utilisation is developed, which allows for an ultimate prediction of people’s utilisation behaviour. The analysis is no longer focussed on the relationship between variables, but on the correlation between the different blocks of variables which have been identified within the multivariate model. First, the OVERALS analysis is performed on each block of variables in order to evaluate the correlation between two blocks of variables in the model. Subsequently, the eigenvalues retrieved from each correlation are inserted into a formula which produces the multiple correlation coefficients (ρ) indicating the strength of the relationship between the blocks of variables (cf. Field 2009; Leurs 2010). Since a multiple regression analysis provides a useful technique for the study of the relationship between blocks of independent and dependent variables, the amount of variation in the block of dependent variables which is accounted for by the model can be estimated. The description of the process of the stepwise statistical analysis concludes with a presentation, in the ultimate model, of the values of the multiple correlation coefficients, which highlight the characteristics of the processes related to the patterns of transcultural health care utilisation behaviour of the population of rural Crete.

Notes

(3.1) The system level of health care utilisation behaviour refers to the concept of medical systems and needs to be distinguished from the notion of health care systems, which relates to institutions offering medical services.

(3.2) According to a recent study conducted among the Mediterranean islands, people with a lower SES are more likely to adopt unhealthy patterns of behaviour, such as smoking and unhealthy dieting patterns, and to have a worsened psychological profile, namely to show depressive symptoms. As a consequence, these individuals tend to develop risk factors for cardiovascular diseases, such as hypertension, hypercholesterolemia and obesity as well as a high body mass index, more frequently than people with a higher SES, who appear to have a greater adherence to the Mediterranean Diet (cf. Panagiotakos et al. 2008).

(3.3) Since concepts are commonly defined by more than one variable, any statistical measurement process eventually applied requires a testing of correlations between the different variables assigned to each concept (cf. Bice et al. 1976; Leurs 2010).

(3.4) The grouping and lineup of categories of the variable ‘Occupation’ follow the order applied in the ‘International Standard Industrial Classification of All Economic Activities (ISIC), Rev. 4’, which has been developed by the Statistics Division of the United Nations (UN) (cf. United Nations 2013).

(3.5) In this respect, an initial research strategy included a comparison between the area of South-Central Crete and the tourist town of Malia on the north coast in the Prefecture of Iraklion. Nevertheless, these initial considerations have been abandoned owing to limitations of time and feasibility.

(3.6) In order to acquire proficiency in Greek, the researcher followed a one-semester beginners course in Modern Greek in 2009 at the ‘Bibliotheek plus Centrum voor Kunst en Cultuur (BplusC)’ in Leiden, The Netherlands, and took intensive private lessons in Modern Greek for a period of approximately three months in 2011.

(3.7) The Easter celebrations held right at the beginning of the fieldwork stage rendered it possible to assist in a number of community gatherings and festivities and to become acquainted with the target population in a rather short period of time.
