Opportunities of GIS in BI

An introduction in GIS for the general BI-er

University of Groningen Faculty of Management and Organization

Msc Technology Management

March 2007

Alina Rozeboom

Student: Alina Rozeboom, s1336223
Aquamarijnstraat 779, 9743 PV Groningen
A.M.Rozeboom.1@student.rug.nl

Client: Ordina VisionWorks, M. J. van Fulpen
Supervisor: R. Meijer

University: University of Groningen, Faculty of Management and Organization
Educational program: Msc Technology Management
Supervisors: Dr. T. W. de Boer, Dr. Ir. R. de Graaf
Date: 30 March 2007

Management abstract

It is likely that the customers of Vertis pay limited attention to the geographic aspect of their data, even though it is estimated that about 80% of all data has a geographic element: customers have addresses, and so do employees and suppliers. This research shows how this aspect of data can be supported using geographic information systems (GIS) in the context of a BI-infrastructure, and what benefits this can bring.

The question that will be answered is:

How can GIS support the three different BI-infrastructure ideal user types?

To answer this question, we first need to understand what GIS is. Geographical information systems are tools that allow for the processing of spatial data into information, generally information tied explicitly to and used to make decisions about some portion of the earth.1

GIS differ from ordinary information systems (IS) mainly in their data types, their data model and the actions that can be executed on the data. A GIS supports analysis by providing visualization capacity (creating and presenting maps), but also through other analysis methods such as calculations.

A GIS can thus support decision making with visualization. How does geo-visualization do that?

Decision making is supported when the right representation method is used. Diagrams such as maps can speed up search, support recognition and support deducing new information, thereby supporting decision making. As an example, table 1 shows the number of doors sold by TheDoor in 2006.

Province        Number of doors sold
Drenthe         279
Groningen       245
Friesland       327
Noord-Holland   467
Zuid-Holland    246
Zeeland         589
Noord-Brabant   187
Limburg         210
Flevoland       368
Overijssel      167
Gelderland      352
Utrecht         675

Table 1: Numbers of doors sold in the Netherlands in 2006

Figure 1: Geo-visualization of numbers of doors sold by TheDoor in 2006

When the data from table 1 is geo-visualized, new information can be deduced, see figure 1. This figure clearly shows that the western part of the Netherlands buys more doors from TheDoor than the northern part.
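The classification behind such a map can be sketched in a few lines. The sketch below is only an illustration, assuming that the three classes in the legend of figure 1 (best, middle and worst selling provinces) are formed by ranking the provinces on the numbers of table 1 and splitting the ranking into three equal groups. The function name classify_into_tiers is hypothetical, and the actual map was presumably produced with GIS software rather than with code like this.

```python
# Door sales per province in 2006, copied from table 1.
DOORS_SOLD = {
    "Drenthe": 279, "Groningen": 245, "Friesland": 327, "Noord-Holland": 467,
    "Zuid-Holland": 246, "Zeeland": 589, "Noord-Brabant": 187, "Limburg": 210,
    "Flevoland": 368, "Overijssel": 167, "Gelderland": 352, "Utrecht": 675,
}


def classify_into_tiers(sales, tier_names=("best", "middle", "worst")):
    """Rank provinces by sales and split the ranking into equally sized tiers.

    Assumption: equal-sized classes, as suggested by the figure legend.
    """
    ranked = sorted(sales, key=sales.get, reverse=True)
    tier_size = len(ranked) // len(tier_names)
    tiers = {}
    for index, name in enumerate(tier_names):
        start = index * tier_size
        # The last tier absorbs any remainder when the split is uneven.
        end = None if index == len(tier_names) - 1 else start + tier_size
        tiers[name] = ranked[start:end]
    return tiers


TIERS = classify_into_tiers(DOORS_SOLD)
for tier, provinces in TIERS.items():
    print(f"{tier:>6}: {', '.join(provinces)}")
```

Each tier would then be given its own colour on the map, which is what turns the table into a geo-visualization.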

1 DeMers M.N., 2002, 7

Legend of figure 1: 4 best selling provinces, 4 middle selling provinces, 4 worst selling provinces

Then the question: can it be realized? Yes, GIS can be incorporated in a BI-infrastructure, but because GIS is very different from BI-infrastructures this is not easy. Currently a GIS can be “integrated” in the BI-architecture by means of a GIS-application. This GIS-application performs a “mediator” function: it translates between the BI-tools and the GIS. Commercial GIS-applications are available, such as MapIntelligence (Brio).

Since not all BI-infrastructure users need the same functionality, and thus not the same support, the question is for which users GIS can be beneficial. Three types of users are distinguished. The ideal type MIR is characterised as a user who wants performance information about the organization, on an aggregated level, on a periodic basis. He does not need to use the BI-infrastructure to analyse the data, so the reports do not need to be dynamic. He wants an answer to the question whether the organization performs as it should. He can benefit only from the geo-visualization capacity a GIS delivers, and does not need it that much.

The ideal user type RMA wants more analysis capacity than the MIR user type. Besides the question whether the organization performs as it should, he also wants to know the reasons when the organization performs below average or below target. To support this, the BI-infrastructure has to be dynamic and provide analysis capacity such as the ability to perform mathematical calculations. This ideal type benefits the most from a GIS because he needs to perform analyses, and a GIS supports that.

The ideal user type PMA differs from the other two types in the nature of the analyses he wants to perform. He is interested in using the BI-infrastructure to create hypotheses about the future. This influences the functionality needed: besides “standard” calculation functionality and a dynamic tool, he needs other functionality such as what-if forecasting, data mining, text mining and statistical analyses. This ideal type user benefits not only from geo-visualization but also from all types of spatial analyses. However, this ideal type user is less common than the other types and is therefore not considered the best target group for Vertis. It is better to focus on the RMA user type as client, because these users will benefit greatly from GIS and are more common in practice.

The largest benefit a GIS can deliver is the geo-visualization capacity because all users benefit from this capacity.

The RMA ideal user type benefits the most from a GIS because it needs more analysis functionality than the MIR user type, and GIS can support this functionality. Compared to PMA users, RMA users are more common, so it is better for Vertis to focus on them.

Preface

Before you lies the result of half a year of hard work trying to understand both the business intelligence world and the GIS world. Business intelligence proved to be an interesting area of research because it requires knowledge of many different areas of expertise, a characteristic that also describes the educational program Technology Management. GIS, however, intrigued me even more, because it is fascinating how humans can grasp so much of the world in a machine.

In my journey of gaining knowledge about business intelligence I received a lot of support from Ronald Meijer, my supervisor and BI-consultant at Vertis. All other employees of the department InControl were also always willing to answer my many questions, thanks!

I would like to thank Ciaran Jetten and Maarten-Jan van Fulpen for giving me the opportunity to carry out this research at Vertis. Our inspiring conversations helped a great deal in defining the problem and the research question.

Gerjan Walrecht deserves an extra round of applause for his patience with my questions regarding GIS. It wasn’t always easy to explain GIS to a layman; thank you.

For our discussions about the relevance of this research, the methodology and other aspects, I would like to thank Thomas de Boer as my supervisor and Rob de Graaf as reviewer. Thomas, your knowledge and associative mind made it much easier to find relevant information about this relatively complex research area. Rob, many thanks for your criticism; it forced me to re-evaluate my work and, in time and in small steps, brought it to a higher level.

What made this project a little special compared with other final master theses is that it was executed under the flag of the Talentproject. I was very fortunate to be accepted into this program, which allows students and organizations in the northern part of the country to exchange and create knowledge. This is realized using a very old concept: that of the expert and his pupil. Ronald Meijer was my expert, and as such he showed me what the work of a consultant consists of. With regard to the Talentproject, I would like to thank Maaike Jongsma for advice and support regarding my learning goals.

This thesis is written as the completion of my study Technology Management. Although it turned out to be a somewhat technical story here and there, I tried to explain everything as clearly as possible. I hope you enjoy reading this thesis as much as I enjoyed writing it!

Alina Rozeboom

March 2007, Groningen

Table of contents

1 Introduction
  1.1 Research motive
  1.2 Thesis structure
2 Preliminary investigation
  2.1 Business intelligence
  2.2 GIS
  2.3 The customer
  2.4 Research implications
3 Problem statement and research approach
  3.1 Problem definition
  3.2 Scientific relevance
  3.3 Research approach
  3.4 Conclusion
4 What is GIS?
  4.1 Back Room - data types
  4.2 Back Room - Data modelling
  4.3 Front Room - Visualization
  4.4 Front Room - Analyses
  4.5 Power of GIS
  4.6 Conclusion
5 Why geo-visualization?
  5.1 Visualization in general
  5.2 Geo-visualization
  5.3 Conclusion
6 Can GIS and BI work together?
  6.1 LVE – mediator solution
  6.2 Additional comments
  6.3 Prediction of future
  6.4 Conclusion
7 What are the characteristics of the different user groups?
  7.1 The DOOR
  7.2 BI-user groups and their characteristics
  7.3 Conclusion
8 What are the current possibilities for BI & GIS for the user groups?
  8.1 GIS functionality
  8.2 GIS for the different user groups
  8.3 Overview and opportunities
  8.4 Conclusion
9 Conclusion and recommendations
  9.1 Answers on the sub-questions
  9.2 Conclusion
  9.3 Recommendations
10 Discussion
Source acknowledgement
List of figures
List of tables
Glossary

1 Introduction

This research is executed on behalf of Vertis, a service organization that advises organizations on business intelligence (BI). Besides giving advice, Vertis also helps to create a BI-infrastructure.

Business intelligence is a relatively new concept, but it is growing very fast. According to Den Hamer:

“Business Intelligence is the focused process, with corresponding facilities, to collect and analyze data and to use the information acquired as a result of this process.”2

In essence this means making informed decisions based on the information available. Of course, organizations have tried to do this for ages, but with the development of information technology both the availability of tools that support this process and the amount of data available for making decisions are growing.

At Vertis, the idea exists that customers could benefit from a geographical representation of their data. Because geographic information systems (GIS) are specialized in geographic representation, a logical question is: can the two be combined? The research motive (§1.1) explains how and why Vertis came up with this idea. After the research motive, the structure of the thesis is clarified (§1.2).

Note

This research was started at InControl, the business intelligence department of Vertis. As of January 2006, Vertis has been taken over by Ordina. The largest part of Vertis continues as an “independent” organization called Ordina Oracle Solutions. Several departments of Vertis, including InControl, have been transferred to other Ordina organizations. InControl is now a business unit of Ordina VisionWorks.

For the remainder of this thesis the name Vertis will be used to indicate the client.

1.1 Research motive

A small but real demand from the market initiated this research. One client of Vertis wanted part of its management information presented on a map. This client, the LVE (Landelijke Vereniging Entadministraties, the national association of vaccination administrations)3, is responsible for the registration of vaccinations in the Netherlands.

The registration of vaccinations for babies, for example against mumps or tetanus, is done in the Netherlands by 10 administrations. To get a clear view of the total degree of vaccination, the LVE wanted an IS that would make the degree of vaccination visible on a map of the Netherlands. Since this was the first application with a geographical element that Vertis developed for a client, the road to the completion of this project was long and not without problems. The departments InControl and GIS (geographic information systems) worked on this project together. Currently the GIS part of this solution still does not work completely.

Besides this example, which shows a market-driven development, a technology push can also be identified. A few large business intelligence software suppliers have recognized a possible new market. Microsoft is developing an update of its reporting tool that provides extended visualization features, such as adding a map of the production floor to an organization’s management reports.

Oracle, famous for its database, extended that database with the possibility to store geographic data types. Previously, storing geographic data types was only possible in a GIS.

2 Den Hamer

3 Internal documentation

Smaller organizations are also developing tools that are said to provide integration between BI and GIS.

An example is MapIntelligence, developed by Integeo and sold by Brio. This product is said to enable the integration between BI and GIS by providing “interactive mapping applications in real time based on information from business dashboards.”4

The dashboard manager of Business Objects (BO, a large BI software vendor) also provides geo-visualization options, although very limited ones.

Based on all these developments Vertis became interested in GIS. Therefore the initial problem statement was:

How can Vertis become successful on the cutting edge of BI and GIS?

This problem statement is meant to give some direction to the research. It is defined very broadly on purpose, because the area of the combination of GIS and BI is unknown territory. The problem statement is therefore the starting point of a preliminary investigation, which leads to a problem statement that represents the actual question answered in this thesis.

1.2 Thesis structure

The thesis will have the following structure:

Figure 2: Thesis structure

The starting point of this research is characterised by:
1) a very broad initial problem statement
2) limited knowledge of BI, GIS and the BI customer of Vertis
3) a restricted research time

Therefore this thesis will be continued with a preliminary investigation in order to get a clearer view on the situation.

4 Map Intelligence, Brio, http://www.brio.nl/subpage.aspx?l1=238&l2=617 , 4 January 2007

In this preliminary investigation attention is paid to:
- A general overview of BI
- A general overview of GIS
- The customer of Vertis

Based on the introduction and the preliminary investigation, the research question is defined. The problem statement gives the goal of the actual research as well as the main research question. Furthermore, sub-questions are presented, together with a method that describes how the main question is answered. Last but not least, attention is paid to the scientific relevance of this investigation: it is clear that Vertis will benefit from this research, but what will it add to other scientific research in this area of interest?

[Figure 2, block Research: GIS (chapter 4), Visualization (chapter 5), GIS & BI (chapter 6), Characteristics user groups (chapter 7), Possibilities for the user groups (chapter 8)]

In the chapters following the problem statement and research approach the research questions are answered.

Composition of the thesis:

Chapter 4 answers the question: What is GIS?
Chapter 5 answers the question: Why geo-visualization?
Chapter 6 answers the question: Can GIS and BI work together?
Chapter 7 answers the question: What are the characteristics of the different user groups?
Chapter 8 answers the question: What are the current opportunities for the different user groups?

The thesis starts with an explanation of GIS, because readers need to understand that GIS and BI are two different areas of expertise and why they grew so far apart. This chapter provides the answer to the first question. The next chapter provides an answer to question 1b: why geo-visualization? It explains why geo-visualization can be of interest for decision support.

Chapter 6 then answers the question whether GIS and BI can work together. Chapter 7 answers the question: “What are the characteristics of the different user groups?” The last question is answered in chapter 8.


Based on the answers found to the research questions, conclusions are drawn in chapter 9, Conclusion and recommendations. Next to the conclusions, some recommendations are presented.

2 Preliminary investigation

At Vertis, the idea exists that customers could benefit from a geographical representation of their data. A logical question is then: can we combine elements from GIS and BI to deliver that to them? Therefore the initial problem statement was:

How can Vertis become successful on the cutting edge of BI and GIS?

This chapter explores the area of BI and GIS in order to define a problem statement in the next chapter. Four questions are answered:

1. What is BI? §2.1

2. What is GIS? §2.2

3. Who is the customer of Vertis and how to consider this aspect in this thesis? §2.3

4. What are the implications for the research statement? §2.4

First it is explained, in short, what BI is and how this influences the initial problem statement. Then a preview of GIS is given. Because the initial problem statement implies that Vertis would like to become successful in this area, attention must also be paid to the customer. At the end of this chapter, implications for the research statement are presented.

2.1 Business intelligence

Because this investigation is about BI, BI is explained first. The goal of this paragraph is not to find the perfect definition of business intelligence, but to understand the subject of this thesis. A few definitions:

Den Hamer5: “Business Intelligence is the focused process, with corresponding facilities, to collect and analyze data and to use the information acquired as a result of this process.”

Vriens & Philips6: “BI is the process of using and obtaining information for strategy development of organizations.”

Turban & Aronson7: “BI is the process that leads to improved decisions and the creation of knowledge.”

The definitions differ in focus. Vriens and Philips focus on delivering information for strategic decisions. Den Hamer, however, focuses on the use of the newly found information. And De Brock uses this information for all sorts of decision making. The phases these authors state also differ slightly. What is most interesting is that only Den Hamer includes the support that makes this process possible in the definition. Because the others do not mention IT in their definitions, the following distinction is made in this thesis: there is a business intelligence process, and the IT support that makes this process possible is called a BI-infrastructure. Further explanation can be found in the following sub-paragraphs.

5 Den Hamer, P. 2005

6 Philips E., Vriens D. 1999

7 Turban E., Aronson J.E., 2001, page 131

2.1.1 The BI-cycle

BI is a cyclic process; this means that it is not a clearly defined project with a beginning and an end.8 The phases stated by the different authors are not the same. Phases that the described cycles have in common are gathering information, analyzing information and using this information. Philips and Vriens add the phase “richten”, which means, freely translated, deciding which data is needed for which problem; in this thesis it is called focusing. Their phase categorization is used because focusing is different from gathering data, and it is possible that GIS supports the processing phase but not the data-gathering phase, or vice versa. Den Hamer emphasizes that all phases have to be executed.

Figure 3: BI-cycle

It is important to understand that the BI-cycle is not complete when the information is gathered and analyzed but not used. This implies that BI is not just a matter of having the IT facilities. IT can support the phases of gathering information and analyzing it. The other phases, deciding what information is needed and using the information found, usually need human interaction. This is important because, when humans need to make decisions based on given information, they have to understand that information.

2.1.2 BI an umbrella concept

Since there are several definitions of BI and a huge number of BI-related concepts, BI is more or less an umbrella concept. To give more insight into these concepts, a list of the most relevant ones is presented in this sub-paragraph:

BI-cycle = the process of focusing, gathering data, analyzing data and using information
BI-infrastructure = the IT (both software and hardware) that supports the BI-cycle
BI-customers = customers (current and future) of Vertis that want a BI-infrastructure
BI-tools = software and hardware that support the BI-cycle
BI-organization = the organization of the BI-cycle
BI-architecture = the architecture of the BI-tools
BI-ambition = the ambition level of an organization concerning BI
BI-maturity matrix = a matrix that is developed to establish the maturity of an organization concerning the BI-architecture, BI-organization and BI-ambition

8 Den Hamer P., 2005

2.1.3 Vertis and BI -> the BI-infrastructure

As stated in paragraph 2.1, the IT facilities supporting the BI-cycle are (in this thesis) called the BI-infrastructure. To understand what a BI-infrastructure does and does not contain, a comparison is made with IT facilities mentioned in theory.

The solution delivered is an information system that enables users to perform ad hoc analyses but also delivers standard reports. At Vertis this is called a management information system (MIS). Theoretically this is not correct. Laudon & Laudon9 define a MIS as follows:

MIS = information systems on management level within an organization that support the functions of planning, control and decision making by delivering overviews and reports of deviations.

This definition does not cover everything Vertis delivers with its BI-infrastructure. Depending on what the customer wants, it is extended with ad hoc analyses, data mining, etc. That looks more like a decision support system (DSS). The difference between a MIS and a DSS is that a DSS can handle non-structured problems. Originally this was not the difference. According to McLeod10, the difference was that a DSS assists a group of managers in making a single decision, whereas a MIS helps managers make decisions so that they can solve problems. Thus a company would need only one MIS and many DSS.

According to Turban & Aronson11 many definitions exist for a DSS, each based on another concept. The essence is that a DSS is an approach for supporting decision making. This is more or less the same as with BI: BI is the process and a BI-infrastructure makes this process possible.

Laudon and Laudon state that the difference between a MIS and a DSS is that a MIS usually delivers only standard reports12. An executive support system (ESS) is considered to support the executives of an organisation13.

The differences are the following14:

Type of system: ESS
  Information input: Collected data; external and internal
  Processing: Diagrams, simulation, interactive
  Information output: Predictive
  Users: Higher management

Type of system: DSS
  Information input: Limited amounts of data or large databases optimized for data analysis; analytical models and tools for data analysis
  Processing: Interactive, simulations, analyses
  Information output: Special reports, analyses for decision making, responses to questions
  Users: Professionals; staff employees

Type of system: MIS
  Information input: Abstracts of transaction data, large amounts of data, simple models
  Processing: Routine reports, simple models, simple analyses
  Information output: Abstracts and exception reports
  Users: Middle management

The table also carries the label BI-infrastructure alongside the system types.

Table 2: Comparison Information Systems

9 Laudon K.C., Laudon J.P., 2002, 43

10 McLeod R., 1994, 23

11 Turban E., Aronson J.E., 2001, 97

12 Laudon K.C., Laudon J.P., 2002, 443

13 Laudon K.C., Laudon J.P., 2002, 43

14 Laudon K.C., Laudon J.P., 2002, 43

But because the clients of InControl (IC) usually want all levels of the organisation provided with information, essentially all three types of systems are covered in one. The proportion differs for each customer. For the remainder of this thesis the BI-infrastructure is defined as:

BI - infrastructure is the IT (both software and hardware) that supports the BI-cycle

The BI-infrastructure supports the BI-cycle. This is a very broad definition because it includes everything from databases to BI-tools.

2.1.4 The architecture of a BI-infrastructure

Because the goal of a BI solution is to support the BI-cycle, it must support focusing, gathering and the use of data. The data that must be gathered is usually stored in many different information systems (IS). These IS are usually transactional systems that are designed to store and process large amounts of data efficiently.

For a number of reasons it is best to build your BI solution not on these transactional systems but on a data warehouse (DWH). The goal of this data warehouse is “support for all levels of management decision-making processes through the acquisition, integration, transformation, and interpretation of internal and external data.”

According to March and Hevner15, a data warehouse is:

Data warehouse; a subject oriented, integrated, time variant, non – updatable collection of data used to support management decision making processes and business intelligence.

This should not be confused with data warehousing which is the development, management, operational methods and practices that define how these data are collected, integrated, interpreted, managed and used.

It should also not be confused with a database. A database is the “container” in which the data is stored. The data warehouse is a database in which the data is stored in such a way that it is particularly useful for building reports on it.

BI on transactional systems is not recommended. When analyses are run on an IS that is at the same time in use for processing purchase order lines, the system becomes very slow and may not be able to handle both tasks. In addition, the data model of a transactional system is not designed for analysis. A DWH is also recommended for historical analysis, because a transactional system stores data only to make transactions possible.

Kimball16 developed an architecture model that is broad and functional and can be used to explain what BI-infrastructures usually look like.

15 March S.T., Hevner A.R., 2005

16 Kimball, 1998, 329

Figure 4: Kimball Data warehouse architecture

This architecture shows that the data is transferred from the source systems (mostly transactional systems) using an ETL process. ETL stands for extract, transform and load. This process ensures that the right data is put in the right place in the data warehouse. The analysis and reporting tools use the data stored in the data warehouse; therefore the stored data must be true and consistent.
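To make the three ETL steps concrete, a minimal sketch is given below. It is an illustration only and not the tooling used at Vertis: the two source lists, the table name fact_sales and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source rows, standing in for two transactional systems
# that record the same kind of sale in different formats.
SOURCE_A = [("Groningen", "2006-03-01", "12"), ("groningen ", "2006-03-02", "5")]
SOURCE_B = [{"province": "Utrecht", "day": "2006-03-01", "doors": 30}]


def extract():
    """Extract: pull raw records from every source into one common shape."""
    for province, day, doors in SOURCE_A:
        yield {"province": province, "day": day, "doors": doors}
    yield from SOURCE_B


def transform(record):
    """Transform: clean names and types so the warehouse stays consistent."""
    return (record["province"].strip().title(), record["day"], int(record["doors"]))


def load(connection, rows):
    """Load: append the cleaned rows to the warehouse fact table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales (province TEXT, day TEXT, doors INTEGER)"
    )
    connection.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    connection.commit()


def run_etl(connection):
    load(connection, [transform(record) for record in extract()])


# An in-memory database plays the role of the data warehouse here.
warehouse = sqlite3.connect(":memory:")
run_etl(warehouse)
TOTALS = dict(
    warehouse.execute(
        "SELECT province, SUM(doors) FROM fact_sales GROUP BY province"
    ).fetchall()
)
print(TOTALS)
```

The transform step is where the "true and consistent" requirement is enforced: without it, the two spellings of Groningen would end up as separate provinces in every report built on the warehouse.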

A BI solution has a front room and a back room. The data is stored and prepared in the back room and presented in the front room. When spatial data is involved in the BI-infrastructure, both the front and the back room need adjustments to deal with this type of data.

Besides that, the arrangement of the back room has an enormous influence on the functionality and the efficiency of the front room.

A BI data warehouse architecture that is often drawn to show what a BI-infrastructure is built of is the following:

[Figure 5 components: Data sources 1 to 4, ETL (extract, transform, load), DWH, Cube, Analysis tool, Reporting tool]

Figure 5: example BI-architecture

The source systems that Kimball presents in the upper left corner of his data warehouse architecture are presented in figure 5 as data sources. The data within these data sources is extracted to the data staging area, in which it is transformed and then loaded into the data warehouse. According to Kimball the data staging area includes a job control; this process is generally not named explicitly. The term ETL process is generally used for the processes that data has to go through before it “enters” the data warehouse.

2.1.5 The BI-infrastructure as toolkit

Another useful way of looking at the BI-infrastructure is to consider it as the set of all BI software tools that support the BI-cycle.

Based on the architecture model of Kimball, a BI-infrastructure is divided into two rooms, a back room and a front room. The back room consists of the ETL process and the data warehouse. Both are tools that provide the BI-infrastructure with a solid basis.

Tools in the front room are reporting and analysis services. The front room can be arranged just the way the customer wants. There are many techniques and tools that can be used for the different demands customers have. These tools and techniques can be categorized using the upside-down pyramid displayed in figure 6.

Figure 6: Front-end tool categorization (by Ronald Damhof)

This pyramid shows that the number of users decreases as the complexity of the tools increases. Besides that, the least complex tools are easy to use and usually static instead of interactive.

2.2 GIS

Now that BI is explained, the focus of attention shifts to GIS. What exactly is GIS and how is it used? A start is made with definitions, essential characteristics are explained and some attention is paid to geographical data.

2.2.1 GIS defined

There is no absolute agreement on the definition of GIS, because the field of geography is difficult to define and integrates many different subject areas17. DeMers defines GIS this way:

Geographical information systems are tools that allow for the processing of spatial data into information, generally information tied explicitly to and used to make decisions about some portion of the earth18.

Definitions from Geographic Information Systems and Science19:
1. A container of maps in digital form
2. A computerized tool for solving geographic problems
3. A spatial decision support system
4. A mechanized inventory of geographically distributed features and facilities
5. A tool for revealing what is otherwise invisible in geographic information
6. A tool for performing operations on geographic data that are too tedious or expensive or inaccurate if performed by hand

17 DeMers M.N., 2002, 7

18 DeMers M.N., 2002, 7

19 Longley et al., 2001, 72

What these definitions have in common is that a GIS uses geo-referenced data to support decision making. This is exactly the characteristic that this thesis focuses on: in essence, it discusses how geo-referenced data can be supported in the BI-cycle. The definition of DeMers is used in this thesis because it emphasizes both the processing of spatial data into information and decision making.

But when is GIS necessary, and when can it add value to the support of decision making? To understand this, the essential characteristics of GIS are briefly explained and special attention is paid to spatial data.

2.2.2 Essential characteristics of GIS

GIS were originally developed to store maps digitally. The first GIS was the Canada Geographic Information System, designed in the mid 1960s.20 This GIS was meant to store information about the nation’s land resources and specifically to make measurement data about the country available.

GIS storage developed from binary files to today’s storage of geographic data in a geo-database. A geo-database makes it possible to store all geographic data in one database, which supports accuracy. It is no different from any other database, except that it provides for the storage of geo-data types. (Geo-data types are explained in chapter 4.)

This already shows one of the key elements of a GIS; geographic data. What exactly is geographic data, and what is the difference with normal data? This is explained in the next sub-paragraph.

Another difference between a geo-database and a ‘regular’ database is the key used to link the data. In a geo-database the link is a reference to the earth. Every event or building can be linked to a place on the earth; this place is expressed in the co-ordinates of a co-ordinate system, for example the Rijksdriehoeksmeting.

This makes it possible to lay maps over each other in order to find relations between, for example, economic data and nature. When a map with data about net income is linked with a map that shows where the country is urbanized and where nature still dominates, one can find interesting information, for instance that people with a high income seem to live in neighborhoods with more trees than houses, or vice versa. These layers (maps) are linked to each other using a co-ordinate system.
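The overlay of layers described above can be illustrated with a minimal Python sketch. It assumes that each layer is a simple table keyed by the same grid co-ordinates; the layer names, the co-ordinates and the income figures are hypothetical and serve only as illustration. A real GIS works with far richer geometry than grid cells.

```python
# Two hypothetical map layers, both keyed by the same grid co-ordinates (x, y).
income_layer = {(0, 0): 42000, (0, 1): 31000, (1, 0): 55000, (1, 1): 28000}
land_use_layer = {(0, 0): "green", (0, 1): "urban", (1, 0): "green", (1, 1): "urban"}


def overlay(layer_a, layer_b):
    """Combine two layers on the co-ordinates they share."""
    shared_cells = layer_a.keys() & layer_b.keys()
    return {cell: (layer_a[cell], layer_b[cell]) for cell in sorted(shared_cells)}


def mean_income_by_land_use(combined):
    """Average the income values per land-use class of the combined layer."""
    incomes = {}
    for income, land_use in combined.values():
        incomes.setdefault(land_use, []).append(income)
    return {land_use: sum(values) / len(values) for land_use, values in incomes.items()}


combined = overlay(income_layer, land_use_layer)
print(mean_income_by_land_use(combined))
```

The co-ordinate pair plays the role of the linking key: neither layer knows about the other, yet they can be combined because both refer to the same places.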

Main characteristics of a GIS are:

1. The possibility to store geographic data

2. Geo-reference as linkage type

3. Support for spatial analysis

About this analysis: a GIS supports the analysis of geo-data. It becomes possible to calculate the surface of an area in m². When one border of this area changes, the table with surfaces does not have to be updated by hand, because the surfaces are recalculated automatically. More sophisticated analysis also becomes possible (with a highly developed GIS, of course), such as: if the number of foxes in a specific area changes, what is the influence on the surrounding areas? The influence can be that a specific tree type returns or that there are fewer rabbits.

Analysis like this can also be interesting for organizations: when our organization opens a store in a specific town, then what is the influence on sales in the surrounding shops?
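The automatic surface calculation mentioned above can be sketched as follows. The sketch assumes that an area is stored as a list of corner co-ordinates in metres and uses the standard shoelace formula; the parcel and its dimensions are hypothetical. It is an illustration of the principle, not a description of how a particular GIS product implements it.

```python
def polygon_area(vertices):
    """Return the surface in m² of a simple polygon (shoelace formula).

    vertices: corner co-ordinates (x, y) in metres, listed in order
    around the boundary.
    """
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]  # wrap around to the first corner
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical parcel of 100 m by 50 m.
parcel = [(0, 0), (100, 0), (100, 50), (0, 50)]
print(polygon_area(parcel))

# The eastern border moves 20 m outward; the surface follows from the geometry.
moved = [(0, 0), (120, 0), (120, 50), (0, 50)]
print(polygon_area(moved))
```

Because the surface is derived from the stored border instead of being stored as a separate figure, a change in the border never leaves an outdated value behind.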

20 DeMers M.N., 2002, iii


A GIS can also be seen as having a back room and a front room; however, the tools, techniques and models of a GIS differ from those of a BI-infrastructure. It is therefore not easy to use current GIS tools, without adjustment, as a front-room tool in a BI-infrastructure. GISs are developed with a specific goal in mind, which differs from the goal of a BI-infrastructure. Logically, their back room is arranged differently to ensure optimal performance in their front room.

2.2.3 Spatial analyses

According to Longley et al., spatial analyses are all analyses whose results change when the locations of the objects being analyzed change21. Thus a calculation of the average income of a group of people is not spatial analysis. The question “where lies the center of the highest net income?” is spatial analysis.

It is important to understand that spatial analysis differs from data-analysis. Data-analysis is highly developed in the BI-scene. Spatial analyses are developed in the GIS-scene. The difference is that spatial analysis has results that change if the location of the objects being analyzed changes.
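The criterion can be made concrete with a small Python sketch. It compares an ordinary average with an income-weighted center, which is used here as a simplified stand-in for the "center of income" question; the records, co-ordinates and field names are hypothetical.

```python
def mean_income(people):
    """Non-spatial: the result ignores where the people are located."""
    return sum(p["income"] for p in people) / len(people)


def income_weighted_center(people):
    """Spatial: the result depends on the location of every person."""
    total = sum(p["income"] for p in people)
    x = sum(p["x"] * p["income"] for p in people) / total
    y = sum(p["y"] * p["income"] for p in people) / total
    return (x, y)


# Hypothetical records with a location (x, y) and an income.
people = [
    {"x": 0, "y": 0, "income": 20000},
    {"x": 10, "y": 0, "income": 60000},
]

# The same people after the second person moves 10 units to the east.
relocated = [
    {"x": 0, "y": 0, "income": 20000},
    {"x": 20, "y": 0, "income": 60000},
]

print(mean_income(people), mean_income(relocated))
print(income_weighted_center(people), income_weighted_center(relocated))
```

The average stays the same after the move, while the weighted center shifts. Only the second calculation therefore meets the definition of spatial analysis.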

It is also important to realize that there is no rule on when to use a GIS and when not. For some analyses the approach of a GIS is best, and for other questions the approach of a BI-infrastructure is better. The two types of systems approach questions from different viewpoints, and this difference did not arise by coincidence. A detailed explanation is given in §4.4.

2.2.4 Special spatial data

As stated in the previous paragraph, a GIS can store geographic data, but what is that exactly? According to Raper22, geographic information is defined by the use of spatial and temporal referencing to characterize information. In other words, information is geographic information when it is characterized by the use of spatial and temporal referencing. Based on this definition almost all information is geographic information, which would mean that almost all information can be handled in a GIS.

Goodchild distinguishes between geo-referenced information and geographic information. Geo-referenced information is any phenomenon that can be geo-referenced, whereas geographic information consists of representations of geographic entities. Adopting this distinction makes matters clearer.

The definition of Goodchild is used in this thesis, which therefore maintains the difference between geo-referenced and geographic information. This makes it very clear that it is not possible to present maps with only the data most companies have. That data is just administrative data that is geo-referenced; to present it on a map, a map is needed. The data that describes this map is the “real” geo-data. Most companies have geo-referenced data but no geographic data. The distinction matters because geographic data is needed whenever a GIS tool is used to represent geo-referenced data.

These two concepts can be illustrated with the following example. All organizations have data about their clients, for example addresses; the address of a client is geo-referenced data. To plot these addresses, a map is needed. Geographic data is data that describes the earth, for example points, lines or areas. When a client lives in Groningen, for instance, the co-ordinates and lines that describe where Groningen lies on the earth and what shape the city has are geographic data.

21 Longley et al., 2001, 278

22 Raper J, et al. 2001, 39


The distinction is important because the storage of geo-referenced data has been possible for many years. The storage of geo-data is also possible, mainly in GIS packages. The combination of these data types to obtain new information, however, is still developing.

Summarizing:

Geo-referenced data = administrative data

Geo-data = spatial data = geographic data = data that describes the earth
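How the two data types are combined can be sketched in Python. The sketch assumes that the address of a client has already been translated into co-ordinates (the geo-referenced record) and that the boundary of a city is available as a polygon (the geographic data). All names and co-ordinates are hypothetical, and the ray-casting test shown is one common way to decide whether a point lies inside an area, not necessarily the method used by a specific GIS.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how often a ray to the east crosses the boundary."""
    x, y = point
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Geographic data: a hypothetical, strongly simplified city boundary.
city_boundary = [(0, 0), (10, 0), (10, 10), (0, 10)]

# Geo-referenced data: hypothetical client records with a location.
clients = [
    {"name": "client A", "location": (5, 5)},
    {"name": "client B", "location": (15, 5)},
]

clients_in_city = [c["name"] for c in clients
                   if point_in_polygon(c["location"], city_boundary)]
print(clients_in_city)
```

Without the boundary polygon the client records cannot be related to the city at all, which is why geo-referenced data alone is not sufficient to produce a map.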

2.2.5 Developments

One of the latest developments is the use of an object-oriented database to store geographic data. Oracle, among others, delivers a database that enables the storage of spatial data. This is possible in the Oracle Database editions Standard Edition, Standard Edition One and Enterprise Edition. These databases include the Locator option as standard, a feature that makes it possible to store and manipulate geo-data. Oracle Spatial is an option for the Oracle Database Enterprise Edition that extends the features provided by Oracle Locator. In general, the extension offers more possibilities for the manipulation of data23.

This can be interesting for this research and for Vertis because many of their clients already have an Oracle database.

2.2.6 GIS as toolkit

Just as the BI-infrastructure can be seen as a toolkit to support the BI-cycle, a GIS can be seen as a toolkit to support, for example, the storage of geo-data. Every GIS consists of features that together provide the needed functionality. An example of a very common feature is mapping services, which make it possible to present a map on the screen. These mapping services can be seen as a tool that delivers this functionality. Other functionalities are the storage of geo-data and editing.

All these functionalities can be seen as tools that together form a GIS. The functionalities/tools distinguished in this preliminary investigation are:

- visualization capacity

- storage of geo-data types

- editing functionality of geo-types

- spatial analyses

Since both the BI-infrastructure and a GIS consist of tools, would it not be interesting if these tools could work together? An even larger toolkit could then be offered, providing more functionality for customers.

2.3 The customer

The initial research problem implies that Vertis wants to become successful at the interface of GIS and BI-infrastructures. To become successful in this area, customers have to be convinced that this will add value to their business. The question then arises: for whom and when can GIS tools be interesting? To answer this question, a general picture of the customer is first presented in §2.3.1. §2.3.2 then discusses BI-GIS customers and the categorization of BI-customers.

23 For a more detailed comparison of the two versions see Oracle.com, both whitepapers are available on the Internet


2.3.1 The customer in general

At Vertis the largest group of BI-customers, about 40%, are health institutions. The remaining 60% range from governmental institutions to insurance companies. What all these clients have in common is that they want an information system that provides the steering information they need. Traditionally this information is used by managers (usually from all hierarchy levels), but especially in the health sector there seems to be a trend towards using the systems at all organisational levels, including non-managerial ones. This means that health workers will also use the BI-infrastructure. Almost all these organisations would like to have standard reports as well as the ability to analyse the data.

In short the process goes like this: a client comes to Vertis with a question. This can be the question: we would like to have better reports, or better steering information. Or, we would like to have all our data in one source. This question can be initiated from two sides: IT or management. It can also be a need due to legislation.

Usually a scoping study is done, which determines what the client really wants and what IT-infrastructure the client already has. The planning and control cycles are also investigated.

The solution delivered is an information system that makes it possible to perform ad hoc analysis as well as deliver standard reports. At Vertis the vocabulary of the client is used, and clients usually refer to such a solution as a MIS. Because it is not a MIS (as explained in the previous section), in this thesis it is called a BI-infrastructure.

The general results of a scoping study can be captured using the business intelligence maturity matrix (BIMM), see table 3. This matrix was developed by CIBIT and distinguishes 4 maturity levels. The BIMM can be used to classify an organization's ambition and current maturity level.

Ambition levels24 are:

1. Locally: to understand local affairs better

2. Coordinated: to coordinate matters better in order to become more efficient without changing current processes

3. Integral: to innovate or change one or more processes in order to improve

4. Intelligent: to introduce completely new processes in and outside the organization as new business cases

BI Maturity Matrix | 1. Locally | 2. Coordinated | 3. Integral | 4. Intelligent
BI-ambitions | “understanding” | “improving” | “optimization” | “innovate”
BI-organization | Ad-hoc | IT-management | CIO | CEO
BI-architecture | None | IT-architecture | Information architecture | Real-time enterprise integration

Table 3: CIBIT Maturity Matrix

Currently, most customers that Vertis serves are at a local or coordinated maturity level.

2.3.2 BI-GIS customers

A general image of BI-customers has now been presented, but what about BI-GIS customers? The available literature currently describes no users of such a BI-GIS combination. The BI-GIS architecture developed by Vertis has also not been used enough to yield information about its users. But when one wants to persuade customers that they need a certain product, one has to be able to explain what the benefits are over their current solution. In this research, the product is a BI-infrastructure with GIS functionality. Current BI-customers are able to make reports and do analyses. What are the benefits of a GIS over the functionalities offered by their current BI-infrastructure?

24 Den Hamer, 2005

To be able to explain to customers what GIS can do for them, it has to be clear how they currently use the BI-infrastructure and where GIS can add benefit to that usage. But there are no current users of a BI-GIS infrastructure who can provide that knowledge. This problem has to be addressed.

To overcome this problem a theory developed by Max Weber is used, which became known as the theory of ‘ideal types’25. These ideal types are ‘theoretically conceived pure types of subjective meaning’ and are meant to serve the researcher by lacking ambiguity. They are concepts that can be used to create hypotheses about social action in a world where no irrational factors influence rational behavior26. In this thesis ideal types are developed to create hypotheses about the benefits and use of GIS in a BI-cycle.

The ideal types will be developed using the information Vertis gained over the years about their users. Starting at the basis, what does a customer want from a BI-infrastructure? A customer wants his BI-cycle to be supported, which means that he wants his decision-making supported. But not every user wants the same functionality and wants to perform the same analysis.

Based on the experience of Vertis with BI it can be stated that there are 3 types of users.

1. users of management information reports

2. users that want to perform reactive management analyses

3. users that want to perform predictive management analyses

A detailed ideal type explanation of these 3 types is presented in chapter 7.

2.4 Research implications

In the previous paragraphs, the basics of the topics of the thesis were briefly addressed. The decisions made in the next chapter are based on the theory presented in this chapter. To make clear what consequences this theory has for the investigation, the following points of interest are summarized:

• BI is about decision making

• The BI-infrastructure consists of a back room and a front room that together make up the BI-infrastructure; they have to fit together to be useful.

• Both the BI-infrastructure and GIS’s can be seen as toolkits which provide technological support for decision making.

• Every company has geo-referenced data.

• The customer aspect of this research is taken into account by using ideal types. Three types are distinguished:

- Users that want to have management information reports

- Users that want to perform reactive management analysis

- Users that want to perform predictive management analysis

Hypotheses of the usage of GIS are established using these ideal types.

The BI-infrastructure is: the IT-infrastructure (both software and hardware) that supports the BI cycle.

25 Weber M,1947, 5

26 Weber M, 1947, 92


3 Problem statement and research approach

In this chapter the problem is defined. After that, the questions that have to be answered to find a solution to the problem are stated.

3.1 Problem definition

The problem definition consists of a research objective and a research question27; both are stated in this paragraph. The research objective answers the question: why is this research conducted? This has to be answered because it is necessary to understand why the investigation matters to the party that commissioned it. This reason has a significant impact on the research question and the research approach. The research question states what has to be answered.

The management objective of this investigation is to become successful in the use of GIS to support geo-referenced data in the BI process. This is the objective of Vertis. It is not the intention to achieve that goal within the months available for this research.

Research objective

The goal of this investigation is to develop a document that provides Vertis with an introduction of the value of GIS into the world of BI, providing a theoretical explorative analysis of the importance of GIS for BI.

Research question

How can GIS support the three different BI-infrastructure ideal user types?

The research question does not completely cover the goal of Vertis; that is not possible given the time restrictions. It does provide a basis and an introduction to GIS for Vertis employees. To become successful it is important to know the basics of GIS, to know where and when GIS can be used, and to develop a business plan. This thesis treats the basics of GIS and gives an introduction to where GIS can be used; a business plan could be a good subject for a new thesis.

It is important to explain the basics of GIS because the employees of Vertis who develop the BI-infrastructures must be able to understand and maybe even implement GIS tools. In addition, explaining the basics makes them aware of the differences and the risks when a BI/GIS project is started.

3.2 Scientific relevance

There are several phenomena that together provide the academic relevance of this research:

1) The spatial aspect of data is underexploited

2) Current research on spatiotemporal information systems focuses on the modelling of data instead of application possibilities

3) Current research on geo-visualization focuses on the general benefits of mapping

Starting with the first statement: the spatial aspect of data is underexploited28. When looking at data, two general characteristics can be distinguished. Every fact, every event, everything stored in a database is tied to the time dimension. An event happened at some time in history and every fact is stored at some time in history. Besides time, geography is another dimension shared by data. Almost everything in the world is tied to a place, whether it is the place where a good is delivered, stored or used.

27 ‘t Hart et al., 1998, 69

28 Gonzales M.L., 2004

Figure 7: Object: time and geography dimension

The dimension time is very important in the world of business intelligence: “without a time context the answers to many questions are meaningless.29” But also the geography dimension is important; “space is as natural to our thinking processes as time30.” At this moment the “where” element, the geographic dimension, is underexploited.

According to Rivest et al.31, today’s commercial tools must be adapted or new ones must be built in order to fully exploit the spatial component of data. They present SOLAP (spatial OLAP) technology, which should merge business intelligence with geospatial technology.

Decision support systems developed from highly specific information systems to more generic applications. At first there were DSS, then ESS, then MIS. A BI-infrastructure has, as explained in the preliminary investigation, characteristics of all three. The main characteristic all these systems have in common is their support of decision making. GIS also supports decision making. Besides that, GIS is strong in spatial analysis and in visualizing data. Why don’t these tools all work together? According to Gonzales, current GIS are typically developed for high-end customers in industries like telecommunications, utilities and government agencies, and the current problem with integrating spatial data into decision support is implementation.

Bedard et al. describe GIS as typically implemented as transactional systems and state that alternative solutions must be used for analytical purposes. One of these alternatives is combining GIS and OLAP into Spatial OLAP.

What these authors do not explain is how GIS can support decision making in the business intelligence cycle. How can customers that have a BI-infrastructure use GIS to support their business intelligence cycle?

When realizing the importance of geographic data (also called spatial data) in making decisions, the aspect of visualization immediately comes to mind. Visualizing geographic data is very powerful in the support of making decisions. The results of spatial analyses are best explained when visualized. 32

Limited attention is paid to this aspect. Rivest et al. present a SOLAP tool that claims to support geo-visualization, but they only mention this functionality and do not elaborate on it.

29 Gonzales M.L., 1999

30 Gonzales M.L., 1999

31 Rivest S., et al. , 2005

32 Gonzales M.L., 2004


One might wonder whether GIS is dismissed too quickly, since its capacity to visualize spatial data is enormous. Research about visualization suggests that this capacity is one of the main reasons why GIS could support business intelligence tremendously. Another reason is the capacity to lay data layers over each other33.

In conclusion, both the academic world and the commercial world recognize that the spatial component of data is underexploited. Both worlds are working on solutions to support exploitation of the spatial dimension, but everyone remains within their own small research area. What is lacking is a holistic view of the support that a GIS can provide for BI-customers, with the customer as focus point.

3.3 Research approach

Based on the preliminary investigation a research question is formulated. This research question will be answered using both literature and the expertise of the employees of Vertis. But before explaining in detail how this investigation is accomplished the audience has to be identified. Next, the sub questions are determined and an explanation is given of how the answer to these questions will lead to an answer on the main question.

3.3.1 Audience

Besides the goal, the target group of readers is identified. This is ‘the general BI-er’. Based on the experiences at In-Control ‘the general BI-er’ has the following characteristics which influence the content of the paper. The general BI-er:

1. considers BI to be the BI-infrastructure

2. knows very little about GIS

This has the following consequences:

1. The difference between BI and the BI-infrastructure has to be explained. (See §2.2)

2. What GIS is and how it works has to be explained. (See Chapter 4)

3.3.2 Answering the main question: methodology

The main question as defined is:

How can GIS support the three different BI-infrastructure ideal user types?

To get an answer to the research question and considering the goal and target reading group, the following questions have to be answered;

1. What can the GIS toolkit add to the BI toolkit?

a. What is GIS?

b. Why geo-visualization?

c. How can BI and GIS work together?

2. What can the GIS toolkit add for the different user groups?

a. What are the characteristics of the different user groups?

b. What are possibilities for the different user groups?

33 Menecke B.E., Crossland M.D., Killingsworth B.L., 2000
