A utilization rate analysis of import cluster Emden Oude Statenzijl.

Half empty or half full?

Johan F. Verlinden


Abstract

Gas Transport Services BV is the Transmission System Operator of the Dutch Gas Transport Network. It is confronted with investment decisions that concern possible expansions of its network capacity. Knowledge about the utilization rate of the network is therefore vital to assess the need for additional investments.

First we will attempt to model the utilization rate by tackling the problem bottom-up, that is, taking the behaviour of the individual shippers as a starting point and aggregating these results later on. It is assumed that these individual shippers make their decisions based on some unobservable factors. These factors will be exposed by performing a Time Series Factor Analysis based on the method proposed by Gilbert and Meijer (2005). This does not lead to a satisfactory model so a new hypothesis is formulated: All shippers behave independently of each other. This assumption is tested by performing a Monte Carlo simulation on the maximum cluster utilization rate. We find that in some years the simulation rejects the independence assumption in favour of the assumption of negatively correlated shipper behaviour.

We conclude that although we do not find the reason for the shippers' behaviour, we do find that in some years this behaviour causes the cluster maximum utilization rate to be lower than it would have been under independent behaviour.

Preface

On March 3 I started my internship at Gas Transport Services, NV Nederlandse Gasunie. For four and a half months I was given the opportunity to write my thesis and at the same time experience the gas business at the Market Monitoring Department. Now that I have finished this project I would like to show my appreciation to everyone who supported me during those challenging though rewarding times.

In particular, thanks go to my parents, who have always supported me during my studies in every way possible. They have been an indispensable factor in the successful completion of my studies, for which I am very grateful.

I also want to give special mention to Adriaan de Bakker, my supervisor at Gas Transport Services. His incredible enthusiasm for the gas business has been a great inspiration and translated into extremely helpful brainstorms and useful feedback. His help has been of great importance for the realization of this thesis.

This thesis completes a five year period of learning for me, both inside and outside the classroom. I want to thank everyone who has been part of this journey for making this period an unforgettable part of my life.


1 Introduction

1.1 Outline

This thesis was written during my internship period at Gas Transport Services B.V. (GTS), one of Gasunie’s three divisions. GTS is the operator of the national gas transmission grid. Its task is to provide independent and high quality gas transport services to facilitate a properly functioning free gas market. GTS ensures that there is sufficient transport capacity and establishes connections to other grids and networks.

This paper will start with the history of the Dutch Gasunie. A description of the environment that Gasunie is operating in is given next. After these sections, the focus of this paper will be outlined by discussing the problem formulation, which concerns the analysis of maximum utilization rates of transport capacity.

In the next chapter I will discuss the data. Constructing the right data involves a lot of choices which will be discussed here. A description and analysis of the data will be given afterwards.

A first step in testing the posed hypothesis is the development of the Time Series Factor Analysis Model. This framework will be applied to a variety of different datasets. Conclusions point towards the formulation of a new hypothesis that is tested in the section afterwards. This is done by performing a Monte Carlo simulation on the maximum cluster utilization rate and applying extreme value theory to the data. The conclusions from this analysis are used to shed light on earlier observations about the data. The paper finishes with a conclusion.

1.2 Gasunie and the liberalized market for natural gas

Gasunie was created in 1963 as a response to the discovery of the Groningen gas field four years earlier. Ownership was split between Shell/Exxon and the Dutch Government (which was represented by government-owned DSM). At the time gas was not considered a valuable source of energy, partly because of the promising technology of nuclear energy. It was decided that Dutch gas was mainly to be used for export. This policy changed when it became apparent that nuclear energy was not as promising as one had hoped, and the increasing dependency on fossil fuels from the Middle East was questioned during the oil crisis in the seventies. These developments suddenly made gas the centre of attention.


explorations in surrounding countries revealed large quantities of gas, predominantly in and around Norway, Russia and Algeria. It turned out that all these newly discovered fields had a calorific value different from the Groningen field gas. This is where the use of different qualities of gas started in the Dutch gas market.

Until the nineties Gasunie’s task consisted of supplying reasonably priced gas under a strict security of supply condition. Then the European Union started to grow stronger and formulate policies that demanded the European markets to integrate. The gas market was not going to be an exception. Fuelled by positive experiences with market forces in the UK and US, the European Commission liberalized the European gas industry.

However, the structure of the market does not make for an easy transition towards one European market. The formulated policy had competition, lower prices and positive environmental effects in mind. This turned out to be hard to achieve, so targets had to be moderated and policies reformulated in order to smooth the path of liberalization. This made government partly regain the power it had decided to give away, though now in the role of managing competition on the natural gas market. Mik (2006) gives a useful discussion on the subject.

In the Netherlands, it was decided that gas transport and trading should no longer be integrated in one organization. As a result, Gasunie was forced to split up into a trading division and a transport division. The split was finalized in 2005, when GasTerra officially became an independent company. The ownership of GasTerra was kept similar to the original division of ownership. However, the new Gasunie now became a 100% public company.

As a division of Gasunie, Gas Transport Services (GTS) provides the public service of gas transport. A liberalized gas market implies that all parties must have access to the transmission grid, without any limitations. An independent supervisor establishes conditions and tariffs, and tests them against the provisions of the Gas Act. In the Netherlands this responsibility has been assigned to a division of NMa called the Energy Chamber (Energie Kamer), formerly DTe. The Energy Chamber is an autonomous organization that regulates the gas and electricity sectors in the Netherlands. Liberalization creates a whole new set of challenges for GTS that will be discussed later on.


are the construction of the Balgzand Bacton Line to accommodate export to the British gas market and the acquisition of BBE, the gas transport network of North-West Germany. With these developments Gasunie aims to attract new gas flows, thereby competing with other European networks and thus embodying the newly created European free market.

1.3 Stakeholders

For GTS there is a paradox between, on the one hand, optimizing utilization of the given capacity and, on the other hand, securing additional capacity to facilitate an efficient free market. This is confirmed by the contrasting targets that the Energy Chamber sets. Its ‘security of supply’ target would suggest additional investments whereas ‘low consumer prices’ discourages additional investments. On the other hand one could argue that additional investment, more capacity and therefore a secure supply prevents price peaks and therefore also lowers prices. This study will not take a stand in this discussion. GTS will have to decide how optimality in gas transport is defined.

The interesting aspect of this market is that the transport market facilitates the gas market. Since the price of transport is only a fraction of the final price of gas paid by consumers, this creates friction between optimizing the gas market and optimizing the market for gas transport. This is reflected in the different viewpoints of the stakeholders. The five largest stakeholders in the gas transport market are GTS, the Ministry of Economic Affairs, the Energy Chamber, the EU and the consumers (shippers). Different stakeholders have different, sometimes conflicting, goals in mind, which we will outline here:

• GTS has the legal target to provide sufficient gas transport under economic conditions. Sufficient implies that enough capacity is available to facilitate an efficient free gas market. On the other hand, the target on economic conditions makes sure that GTS, like any other profit maximizing company, properly evaluates its business cases.

• The Ministry of Economic Affairs is mainly concerned about the security of supply. A thorough assessment of Europe’s security of supply issues is given in ‘Europe’s Vulnerability to Energy Crises’ by the World Energy Council (2008). A lack of investment in gas infrastructure and storage was determined to be one of the key threats to security of supply.


• The European Union pushes for one competitive gas market resulting in competitive gas prices. Theoretically prices at different locations should never differ more than the transport costs between those two locations. Since the grid poses severe limitations on transport this theoretical result is far from the observed situation. A promising theory on how to test this econometrically is given by Marmer, Shapiro and MacAvoy (2007), using cointegration of spot gas prices to determine regional dependence.

• Finally, the consumers (shippers) want low consumer prices. This is most likely to be attained when transport capacity is abundant and the free market is fully functioning. Witteloostuijn (2007) explores the ramifications of the Energy Chamber’s policy which is a policy of consistently lowering gas transport tariffs. He argues that, because of imperfect competition, not all rents are passed on to the end-consumers. In addition he finds that a substantial part of the rents are exported. Therefore, much of the gains of lower transport tariffs are lost.

The Transmission System Operator’s challenge is to formulate policy taking all these viewpoints into account. In the next section we will give an overview of the structure of the GTS transport system.

1.4 Entry-exit system

GTS has implemented an entry-exit system, in which the gas enters the national gas transmission grid at entry points and leaves the grid at exit points. Entry capacity booked at a specific point gives the right to inject a specific volume of gas per hour into the grid at that specific entry point. Booked exit capacity gives the right to extract a specific volume of gas per hour from the grid at a certain exit point.


GTS decided to sell interruptible capacity, which is capacity that is offered to the market when firm capacity is not fully used. The capacity that was already offered was called firm and is available with 100% certainty since it corresponds to the physical availability of capacity. The new capacity was called interruptible and has a slight chance of being interrupted.

So where does the capacity that is not physically available (interruptible capacity) come from? There are two main causes for interruptible capacity to become available. The first cause is other shippers not nominating their firm capacity. If shippers that buy firm capacity only nominate part of it, the other part can be used as interruptible capacity. In this way it is ensured that physical capacity is not lost because of low utilization of firm contractual capacity. The other cause for interruptible capacity to become available is contractual gas flows in the other direction. If shippers nominate gas flows in the export direction then the same amount of import capacity becomes available.

The reason that this capacity is called interruptible is that it cannot be guaranteed that extra capacity will become available. It becomes available only when shippers nominate less than their bookings or nominate export flows. The capacity that is sold on an interruptible basis can only be confirmed when the behaviour of other shippers creates free space in the capacity. Since these behaviours are predictable this capacity is still sold, though with a slight chance of interruption.

The chance of interruption is published as far as it can be predicted based on past performance, so that shippers can make a rational decision whether this capacity is of any value to them. Obviously this capacity is substantially cheaper to book. Interruptible capacity will not be sold until all firm capacity is sold out.
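As an illustration of the two sources of interruptible capacity described above, the following minimal Python sketch adds the firm capacity that shippers leave un-nominated to the flow nominated in the export direction. The function name, the shipper labels and all quantities are hypothetical, and the sketch deliberately ignores any further operational rules that GTS may apply before releasing interruptible capacity.

```python
def interruptible_capacity(firm_booked, firm_nominated, export_nominated):
    """Capacity that could be offered on an interruptible basis for one hour.

    firm_booked, firm_nominated: dicts mapping shipper -> m3 per hour (import).
    export_nominated: total flow nominated in the export direction (m3 per hour).
    This is a simplified illustration, not the rule GTS actually applies.
    """
    unused_firm = 0.0
    for shipper, booked in firm_booked.items():
        nominated = firm_nominated.get(shipper, 0.0)
        if nominated > booked:
            # A nomination can be less than, but never more than, the booking.
            raise ValueError(f"{shipper} nominated more than its firm booking")
        unused_firm += booked - nominated
    # Export nominations free up the same amount of import capacity.
    return unused_firm + export_nominated


# Hypothetical numbers: shipper_a leaves 30 unused, shipper_b uses everything.
booked = {"shipper_a": 100.0, "shipper_b": 50.0}
nominated = {"shipper_a": 70.0, "shipper_b": 50.0}
available = interruptible_capacity(booked, nominated, export_nominated=20.0)
print(available)  # 50.0
```

In this toy case 30 units come from un-nominated firm capacity and 20 units from the export flow. Both sources depend on the behaviour of other shippers, which is why the resulting capacity cannot be guaranteed.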

The complete process from start to finish can be described as follows. A shipper books entry and exit capacity in advance. By the time the shipper knows the amount of gas it wants to transport it nominates this amount, which can be less than, but no more than, its booked capacity. GTS will then, if conditions are satisfied, confirm this nomination. The shipper is then allowed to inject this amount of gas at the nominated entry point and extract it from the nominated exit point. Afterwards the allocated quantity is determined based on the real physical flows and the nominations.


energy value of the gas at any particular stage is measured by the Wobbe-index. Groningen gas is characterized by a low Wobbe-index. This means that it can be necessary to lower the quality of the gas that is imported from other fields. This process is called quality conversion and is done by blending different gas qualities or injecting nitrogen in the high calorific gas. At the time of writing quality conversion can be booked at GTS. However, in the future this service will be included in the gas transport services. Every shipper will be able to have the quality of its gas converted without having to pay an extra fee. The costs of this service will be spread over all other tariffs.

1.5 Problem formulation

When the entry-exit system was implemented in 2003 and GasTerra was split off in 2005, Gas Transport Services lost its information about the size and direction of the major gas flows. Shippers can buy entry and exit capacity at different network points in the grid and then postpone the decision of which network points to use to a later point in time. This makes it hard to predict utilization rates at particular network points. Since GTS policy has always aimed at providing full certainty about the availability of transport capacity GTS uses worst case scenarios to determine what capacity is required.

To regain some of the information about future gas flows, GTS decided to hold shipper inquiries on a regular basis. These inquiries are called ‘Open Season’ and determine the capacity that will be demanded in the future and will therefore be built. Building additional capacity is a typically discontinuous process, so investment decisions have to be made well in advance and with a long term perspective. These ‘Open Seasons’ are important since in a liberalized and uncertain market asymmetric information will cause underinvestment, which puts the security of supply target at risk. A number of options are available in the event of a capacity shortage. One example would be activating storage capacity to make up for temporary shortages.


Overbooking gives shippers more flexibility in deciding upon their entry and exit points after their initial booking. This is called option value. The option of making a last moment decision can be of great value. Entry flexibility can be valuable when production is failing at some point or when prices from different sources differ. By having flexibility shippers can optimize their portfolio by choosing the most profitable entry point. Exit flexibility also has value, for example when price differences make switching export country profitable. This flexibility causes a low utilization rate, which can be regarded as suboptimal in terms of transport efficiency. However, from the perspective of the gas market a low utilization rate can be optimal. An interesting application of this is called the Jepma (2001) effect and is caused by the growing internationalization of the gas market. The Jepma effect describes the possibility of foreign gas flows entering the Dutch transmission grid for transit purposes, caused by differences in transport tariffs. The other way around, Dutch gas transiting through Germany, is another possibility. The inherent unpredictability of this effect complicates investment decisions. If TSOs on both sides of the border anticipate this effect, a low utilization rate on one of the two sides will be the result.

Another argument for overbooking might be hoarding. This is overbooking on purpose to tighten the market and put other shippers out of business. This will become more relevant when lots of smaller shippers enter the market. The extent to which this is happening is hard to determine. On the other hand, when the market is tight, shippers also hold on to capacity they do not really need since it might be difficult to reclaim this capacity later. This happens when a shipper’s market share diminishes and transport tariffs are low enough to hold on to their capacity, even though they are not planning to use it.

Finally, neglecting market information and therefore making suboptimal decisions might result in overbooking and therefore low utilization.

One can summarize this by stating that the market has become increasingly dynamic with the liberalization as more and more trade occurs, both within countries and across borders. The shorter contract duration also increases volatility. This volatility prevents the steady and predictable utilization rate that existed before the liberalization, when the system was still tailor-made.


Therefore the focus of this paper can be summarized in the following research question: Can we get an understanding of shipper behaviour that enables us to explain the maximum cluster utilization rate?

It is assumed that every shipper behaves rationally in some way, even though from the system’s perspective the utilization rate is far from optimal. The rationale behind this phenomenon is what motivates this study.

What this study will do is gain insight into the behaviour of shippers and therefore into the likelihood of these shippers simultaneously increasing their demand for capacity, thereby peaking utilization. This study will not address questions on optimality or efficiency. Both are business concepts that will have to be defined by the Transmission System Operator.


2 Data

2.1 Description

The transmission grid is the core asset of Gasunie. This grid is only useful if sufficiently connected to external parties. These are responsible for supply and demand of the gas. The gas is injected in the grid from a variety of sources.

National production makes up for a large share of entry capacity. Most of this is produced in the North Sea. Another example of an entry source are the underground storages. In addition, there is the Groningen gas field, another production field and the largest in the region. Finally, import already makes up for a substantial share of the supply of gas that the GTS grid is fed with.

Exit capacity can also be categorized, where the obvious categorization is industry, power stations, households, storage and export. Industries, power stations and households are all spread across the country. Storages have been built mainly on old gas fields, which are located in the north of the Netherlands. By contrast, export mostly takes place in the south of the Netherlands. Only 2 years ago the BBL was constructed to serve the British market, making it an important export point in the north west.

It is expected that by 2015 30% of the local production will be replaced by import from foreign countries. This is caused by an overall decline in production, not only in the Netherlands but all across Europe. This makes import an increasingly important supply factor. At the moment most of the gas is imported at the cluster Emden Oude Statenzijl (OSZ) in the North East of the province Groningen.

Emden OSZ is not only by far the largest import cluster, it is also known to be a cluster where import capacity is scarce and the utilization rate is a much debated topic. Therefore the analysis of this paper will focus on import in the Emden OSZ cluster. Narrowing down the scope gives us a more tractable problem. In addition, results that are found for this cluster might be transferable to other points in the network.

The Emden OSZ region consists of six network points suitable for importing H-gas. Since G-gas is predominantly produced locally we choose to analyze the import of H-gas as this is the prevalent gas quality for import purposes. The network points are:

• Emden NPT
• Emden EPT
• Oude Statenzijl Ruhrgas-H
• Oude Statenzijl BEB-H
• Oude Statenzijl D-gas

and are presented in Figure 1.

Figure 1: A graphical representation of the Emden Oude Statenzijl Cluster

The pipelines from all six network points converge towards the Noordbroek-Ommen (NO) pipeline. The sum of the capacities of the different network points greatly exceeds the capacity of the NO pipeline. NO is the bottleneck and therefore the grid point that we choose to analyze.

This has been one of the major issues for data collection. Data for the different network points is readily available but data for the NO pipeline is only available on aggregate, not per shipper. This makes construction of the dataset a cumbersome task which involves a lot of choices. The data has to be constructed from the data of the different network points by aggregating it, but only aggregating those allocations and capacities that actually use the cluster bottleneck (NO pipeline).


enter the cluster at some network point, stay in the cluster and exit the cluster at another network point. As a result wheelings do not put any pressure on the cluster capacity. Another type of special contract is ’short haul’ contracts. These are agreements where gas enters the cluster but exits the cluster before it enters the bottleneck. An example of this in the Emden OSZ cluster is the contract that serves the power station at Eemshaven.

Graphically the cluster is modeled as in Figure 2. It shows how the import of H-gas in the six network points is aggregated and then split up into a gas flow entering the NO pipeline and one for the other directions.

Figure 2: The model version of the Emden Oude Statenzijl Cluster

After the right set of inclusions and exclusions has been determined for the cluster over a particular period of time the data can be produced. This is the time to make some more choices on which type of capacity we are looking at. Since GTS started to sell interruptible capacity in addition to firm capacity there is a distinction between the two datasets.


Possible definitions of the utilization rate would be dividing firm allocations by firm usage rights, or dividing the sum of interruptible and firm allocations by the sum of interruptible and firm usage rights. Another possibility would be dividing firm and interruptible allocations by only firm usage rights. The advantage of this last definition is that it captures all allocations but still uses firm usage rights, which correspond to physical capacity. Unfortunately this definition breaks down when analyzed at shipper level. Some shippers have no firm usage rights but do have interruptible usage rights. When they allocate on their interruptible usage rights their theoretical utilization rate would be undefined.

Choosing between the first two definitions, we choose the first definition with only firm allocations and usage rights instead of the sum of both firm and interruptible. Interruptible allocations and usage rights can be defined in many ways. There is no fixed amount of interruptible capacity available, as this can fluctuate based on the amount of import that is not nominated and the size of the export flows. We find that interpreting results in terms of interruptible capacity makes our conclusions unnecessarily complicated. Still it should be kept in mind that excluding interruptible allocations generally excludes about 20% of the allocations, which is a significant part.
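To make the difference between the three candidate definitions concrete, the sketch below evaluates them for two hypothetical shippers in a single hour. All names and numbers are illustrative assumptions and are not taken from the GTS data. The second shipper holds only interruptible usage rights, which is the case in which the third definition (all allocations divided by firm usage rights) has no meaningful value at shipper level.

```python
def utilization(allocations, usage_rights):
    """Return allocations / usage_rights, or None when the ratio is undefined."""
    if usage_rights == 0:
        return None
    return allocations / usage_rights


# Hypothetical figures for one hour (m3 per hour); not taken from the GTS data.
shippers = {
    "shipper_a": {"firm_alloc": 60.0, "int_alloc": 0.0, "firm_rights": 80.0, "int_rights": 0.0},
    "shipper_b": {"firm_alloc": 0.0, "int_alloc": 15.0, "firm_rights": 0.0, "int_rights": 20.0},
}

results = {}
for name, d in shippers.items():
    total_alloc = d["firm_alloc"] + d["int_alloc"]
    total_rights = d["firm_rights"] + d["int_rights"]
    results[name] = {
        # Definition 1: firm allocations over firm usage rights.
        "firm_only": utilization(d["firm_alloc"], d["firm_rights"]),
        # Definition 2: firm plus interruptible, in numerator and denominator.
        "firm_plus_interruptible": utilization(total_alloc, total_rights),
        # Definition 3: all allocations over firm usage rights only.
        "all_alloc_on_firm_rights": utilization(total_alloc, d["firm_rights"]),
    }

for name, rates in results.items():
    print(name, rates)
```

Under the first definition the second shipper simply has nothing to measure, since it has neither firm allocations nor firm rights. Under the third definition it has positive allocations set against zero usage rights, so no utilization rate can be assigned to it.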

Another choice is the timeframe we are dealing with. GTS and its balancing system is based on hourly data. This makes hourly data the obvious choice for a time frame. Still, for some applications it will turn out that we have to aggregate the data over a longer time span. Those datasets are constructed by taking the average over hourly data. This has the disadvantage of not taking into account the peak in these time spans, even though the peak is what interests us most. However, the alternative, taking the maximum of the hourly data in that time span, is an unfair comparison since these peaks occur at different moments in that time span for different shippers and therefore overestimate the aggregate peak.
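The trade-off between the two aggregation methods can be seen in a small numerical sketch. The hourly allocations below are invented for illustration. The example shows that averages add up consistently across shippers but hide the peak, whereas adding up the individual maxima overstates the peak of the aggregate because the individual peaks fall in different hours.

```python
from statistics import mean

# Hypothetical hourly allocations of two shippers over a four-hour time span.
allocations = {
    "shipper_a": [20.0, 90.0, 30.0, 40.0],
    "shipper_b": [80.0, 10.0, 20.0, 30.0],
}

# Aggregate (cluster) allocation per hour.
cluster = [sum(hour) for hour in zip(*allocations.values())]

true_peak = max(cluster)                                            # peak of the aggregate
sum_of_peaks = sum(max(series) for series in allocations.values())  # individual peaks added up
cluster_mean = mean(cluster)                                        # average of the aggregate
sum_of_means = sum(mean(series) for series in allocations.values()) # individual averages added up

print(true_peak, sum_of_peaks)     # 100.0 170.0
print(cluster_mean, sum_of_means)  # 80.0 80.0
```

Here the shippers peak in different hours, so the sum of the individual maxima (170) is far above the true aggregate peak (100), while the average (80) is consistent across aggregation levels but lies below that peak.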

Now that we have data per shipper for allocations and usage rights, both per hour, for the Emden OSZ cluster, we are ready to commence the analysis. We determine the utilization rate by dividing the firm allocations by the firm usage rights. It should be kept in mind that this utilization rate does not give us any information about physical utilization or physical capacity shortages. The utilization rate is a contractual concept. The physical utilization rate might be higher because of the use of interruptible capacity, or it might be lower because of contractual export occurring in the same cluster.


difficulties in interpretation with utilization rates over 100%.

So eventually we have composed a database with data from 2003 up to 2007 for the Emden OSZ cluster bottleneck. The data consists of shipper allocations and usage rights, both under firm conditions and on an hourly basis. The ratio of these two numbers is what we call the utilization rate. A further discussion on this and a presentation of the data for the different years is given next.

2.2 Analysis for 2007

For definitions and mathematical notations that are used in the following treatment we refer to Appendix 6.1.1.

The data for 2007 shows the presence of 20 active shippers at the import cluster Emden OSZ. The five largest shippers together account for 76% of the market. This is measured by considering average usage rights $S_i^a$ over the year, without looking at the utilization of these usage rights. Most shippers fully utilize their usage rights at some point in the year, or else get close to full utilization.


A graphical representation of the cluster allocations $A_t$ and cluster usage rights $Z_t$ is given in Figure 3. The 8760 hours in the year produce 8760 data points. The cluster capacity in 2007 was between 2.9 and 3.0 million m³ per hour. The maximum is attained on September 11 between 13h and 14h, corresponding to an 82% utilization rate. It can be seen that in summer the utilization rate is somewhat higher than in winter. Although this might seem counterintuitive, it is a result of large import and export contracts that are in place in this cluster. There is a flat import pattern balanced with export contracts. These exports are seasonal, with export higher in winter than in summer. This results in a lower net import in winter and therefore a higher net import in summer. Another observation is that the series is highly volatile.

Using the definition for total cluster utilization rate $Y_t$, this results in a utilization rate as presented in Figure 4.

Figure 4: Total cluster utilization rate 2007

2.2.1 Load Duration Curve


utilization rate as the variable of interest. The Total Cluster LDC for utilization rate L is defined as Y sorted in a descending order.

We present this total cluster utilization rate over 2007 in Figure 5. For the Transmission System Operator the first part of the graph is of particular interest. This shows the distribution of the utilization in the busiest periods. It is found that the highest utilization during the year 2007 was 82%, which is the starting and highest point of the LDC. The curve then goes down smoothly towards the lower utilization rates. This smoothness points towards a market with no obvious structural breaks.
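Constructing an LDC amounts to sorting the utilization series in descending order. The sketch below illustrates this with a handful of made-up hourly utilization rates; the function name and the values are assumptions for illustration only, whereas the actual 2007 curve is built from 8760 hourly observations.

```python
def load_duration_curve(series):
    """Return the series sorted in descending order (the LDC)."""
    return sorted(series, reverse=True)


# Hypothetical hourly cluster utilization rates Y_t.
y = [0.55, 0.82, 0.61, 0.47, 0.70]
ldc = load_duration_curve(y)

# The first entry of the LDC is the maximum utilization rate.
peak = ldc[0]

# Fraction of hours in which utilization is at least 60%.
share_above_60 = sum(1 for value in ldc if value >= 0.60) / len(ldc)

print(ldc)             # [0.82, 0.7, 0.61, 0.55, 0.47]
print(peak)            # 0.82
print(share_above_60)  # 0.6
```

Because the time ordering is discarded, the LDC shows how often a given utilization level is reached, not when it is reached.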

Figure 5: The Total Cluster LDC for utilization rate over 2007


Figure 6: Load Duration Curve of utilization rate including interruptible allocations over 2007

The LDC is the starting point of every utilization rate analysis. The next section will compare this LDC with other definitions of this curve.

2.2.2 The Unsimultaneity Index

There is another type of LDC that we can construct. We start by constructing individual LDCs for individual shippers. This means taking the utilization rates of an individual shipper $y_i$ and sorting them in descending order, $y_i^{\downarrow}$. This new vector is defined as $l_i$.

We can then add up these individual shipper LDCs to get a new overall LDC. We call this the worst case LDC since it implicitly assumes that all the highest shipper utilization rates occur simultaneously. It is the Load Duration Curve that would be found if, given this data, all shippers peaked their utilization rate simultaneously instead of with the observed unsimultaneous peaks.


over the year. The resulting LDC is constructed as follows:

$$LW_t^a = \sum_{i=1}^{N} S_i^a \, l_{it} \tag{1}$$

for $t = 1, \dots, T$, and so

$$LW^a = (LW_1^a, LW_2^a, \dots, LW_T^a) \tag{2}$$

The result is the worst case LDC, where the first entry is the maximum utilization rate of all individual shippers weighted by average market share. The second entry consists of the second highest utilization rate of all individual shippers weighted by average market share, and so on.

The point of interest is the starting point of this LDC, the scenario that would occur if all shippers peak at the same time. The value in the construction of this worst case LDC can be found in the difference between its peak and the peak of the cluster LDC. We call this the ’Unsimultaneity Index’.
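The construction of the worst case LDC and the resulting index can be sketched as follows. This is a toy illustration under stated assumptions: the two shippers, their utilization rates and their equal, constant market shares are invented, and with constant shares the share-weighted simultaneous series coincides with the cluster utilization rate, so the two peaks are comparable.

```python
def worst_case_ldc(y, avg_share):
    """Worst case LDC: sort each shipper's rates, then weight by average share.

    y: dict mapping shipper -> list of utilization rates (equal lengths).
    avg_share: dict mapping shipper -> average market share (summing to 1).
    """
    sorted_rates = {i: sorted(rates, reverse=True) for i, rates in y.items()}
    horizon = len(next(iter(y.values())))
    return [sum(avg_share[i] * sorted_rates[i][t] for i in y) for t in range(horizon)]


# Hypothetical data: the two shippers peak in different periods.
y = {"shipper_a": [0.9, 0.3, 0.5], "shipper_b": [0.2, 0.8, 0.4]}
avg_share = {"shipper_a": 0.5, "shipper_b": 0.5}

lw = worst_case_ldc(y, avg_share)

# Share-weighted utilization as actually observed, period by period.
weighted = [sum(avg_share[i] * y[i][t] for i in y) for t in range(3)]

# Difference between the worst case peak and the observed peak.
unsimultaneity_index = lw[0] - max(weighted)

print([round(v, 2) for v in lw])
print(round(unsimultaneity_index, 2))
```

In this example the worst case peak is about 0.85 while the observed peak is about 0.55, so an index of roughly 0.30 reflects that the two shippers do not peak at the same time. An index of zero would mean that all individual peaks coincide.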

However, by deciding to use average market shares over the year we implicitly assumed that market shares are constant over the year. This is far from the truth in this case. A significant group of large shippers vary their usage rights over the year, which means that weighting by average market share is not a straightforward choice.

This is because there is a distinction between the original total cluster utilization rate and the total cluster utilization rate weighted by average market share. This distinction is caused by usage rights changing over time. The total cluster utilization rate is defined as:

$$Y_t = \frac{A_t}{Z_t} = \sum_{i=1}^{N} y_{it} \, s_{it} \tag{3}$$

and its maximum is $Y_{\max}$ or $L_1$. $L$ is constructed by sorting $Y$ in descending order. The total cluster utilization rate weighted by average market share is defined as:

$$X_t^a = \sum_{i=1}^{N} y_{it} \, S_i^a \tag{4}$$

with its maximum $X_{\max}^a$ or $LS_1^a$. $LS^a$ is constructed by sorting $X^a$ in descending order. With constant usage rights and market shares $X_t^a$ equals $Y_t$, but with market shares changing over time $X_t^a \neq Y_t$.


calculating $X_t$ we implicitly assume constant market shares. For analyses over a year this is far from the actual situation. In general $LS^a$ can be higher as well as lower than $L$. However, the persistent behavioural pattern of the shippers makes this method estimate a lower $LW^a$ than $L$. This is caused by high utilization rates in periods in which shippers also have a higher market share $s_{it}$ than their average market share $S^a_i$.

This phenomenon is known as the Simpson paradox, since the effect was described by Simpson (1951). The effect had been described earlier (Pearson, 1899), but the name of Simpson was popularized by Blyth (1972). An example of its applications is given in Wagner (1982). I will give a short example here to illustrate the effect. In a most simplified world there are two shippers and two periods. The following data describe the situation; Figure 7 shows the problem graphically.

$z_{11} = 40$, $a_{11} = 10$, $y_{11} = 25\%$

$z_{12} = 80$, $a_{12} = 60$, $y_{12} = 75\%$

$z_{21} = 80$, $a_{21} = 60$, $y_{21} = 75\%$

$z_{22} = 40$, $a_{22} = 10$, $y_{22} = 25\%$

Calculating the cluster utilization rate, we find the same Yt in both periods:

$$Y_t = \frac{a_{11} + a_{21}}{z_{11} + z_{21}} = \frac{10 + 60}{40 + 80} = 58\% \tag{5}$$

Obviously the average market shares of both shippers equal 50%. Now we are equipped to calculate the by average market share weighted cluster utilization rate:

$$X_t = S_1 \, y_{11} + S_2 \, y_{21} = 0.5 \cdot 0.25 + 0.5 \cdot 0.75 = 50\% \tag{6}$$
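The arithmetic of this two-shipper example can be checked with a short Python sketch. The variable names are chosen for illustration only; the first index denotes the shipper and the second the period, which is the reading implied by equation (5).

```python
# Usage rights z[i][t] and allocations a[i][t] for shipper i in period t.
z = [[40, 80], [80, 40]]
a = [[10, 60], [60, 10]]
n_shippers, n_periods = 2, 2

# Individual utilization rates y_it = a_it / z_it.
y = [[a[i][t] / z[i][t] for t in range(n_periods)] for i in range(n_shippers)]

# Actual cluster utilization rate per period: total allocation over total usage rights.
Y = [
    sum(a[i][t] for i in range(n_shippers)) / sum(z[i][t] for i in range(n_shippers))
    for t in range(n_periods)
]

# Market share per period and its average over the periods.
share = [
    [z[i][t] / sum(z[k][t] for k in range(n_shippers)) for t in range(n_periods)]
    for i in range(n_shippers)
]
S = [sum(share[i]) / n_periods for i in range(n_shippers)]

# Cluster utilization rate weighted by average market share.
X = [sum(S[i] * y[i][t] for i in range(n_shippers)) for t in range(n_periods)]
```

Both periods give an actual rate of 70/120 (about 58%) and a weighted rate of 50%. The gap arises because each shipper has its high utilization rate in the period in which its market share (2/3) exceeds its average share (1/2).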


Figure 7: Simpson example

When we look at the actual data we find that there is a severe downward bias in a weighted market share approach.


Figure 8: Actual LDC and two average weighted LDC’s

We are tempted now to give an interpretation to the difference between the start of the original LDC and the worst case LDC, that is $LW^a_1 - L_1$. However, these LDC’s are not constructed from the same data, so comparing their maxima would not be a fair comparison and would give meaningless results. The comparison that we do have to make is the one between the by average market share weighted cluster utilization rate LDC and the by average market share weighted worst case LDC, that is $LW^a_1 - LS^a_1$. These are based on the same data so a meaningful interpretation can be given.


However, since we are primarily interested in the peak cluster utilization rate another weighting method seems appropriate. That is, if we decide to weight not by average market share over the year but by market shares at the time of cluster peak utilization then the peak of the weighted utilization rate LDC and the actual cluster utilization rate LDC will coincide. At least for the peak, this solves the confusion of having two different measures for cluster utilization rate.

The consequence of this is that the worst case LDC will also have to be weighted by the market shares at the time of peak utilization rate. By using the same market shares for the construction of both LDC’s a fair comparison can be made and an interpretation can be given to the differences of these peaks. For this we still need to assume that market shares are constant over the year, in this case fixed at the market shares attained at peak cluster utilization rate.

Mathematically this translates into the following definitions. The by peak market share weighted total cluster utilization rate is defined as

$$X^p_t = \sum_{i=1}^{N} y_{it} \, S^p_i \tag{7}$$

for $t = 1, \ldots, T$, where sorting $X^p$ in descending order gives $LS^p$. The by peak market share weighted worst case LDC is constructed by

$$LW^p_t = \sum_{i=1}^{N} S^p_i \, l_{it} \tag{8}$$

for $t = 1, \ldots, T$, and so

$$LW^p = (LW^p_1, LW^p_2, \ldots, LW^p_T) \tag{9}$$

New graphs can be drawn to show the LDC’s constructed from these peak weighted market shares (Figure 9). Indeed we find that the LDC’s for the total cluster utilization rate coincide for the first observation, that is

$$LS^p_1 = \sum_{i=1}^{N} y^p_i \, S^p_i = \frac{A^p}{Z^p} = L_1 \tag{10}$$


Figure 9: Actual LDC and two peak weighted LDC’s

This Unsimultaneity Index (UI) is calculated by taking the difference between the worst case scenario and the observed maximum utilization rate. That is, for one particular year we calculate:

$$UI = LW^p_1 - L_1 \tag{11}$$

For the 2007 data the Unsimultaneity Index is found to be 90.2% − 82.3% = 7.9%. It measures the percentage of extra utilized capacity that would have been demanded if complete simultaneity of peak utilization had occurred. Or stated differently, the capacity that is not utilized because of unsimultaneity.

The rationale behind the UI is that it measures the percentage of capacity that is lost because of unsimultaneity, in contrast to the capacity that is lost because it has not been used at all. We define the latter capacity as dormant capacity:

$$DC = 100\% - LW^p_1 \tag{12}$$
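Taken together, definitions (11) and (12) split total capacity into three parts: the observed peak, the Unsimultaneity Index and the dormant capacity. A small Python sketch with the 2007 figures quoted above makes this explicit; the function names are hypothetical.

```python
def unsimultaneity_index(worst_case_peak, observed_peak):
    # UI: worst case peak minus the observed peak utilization rate.
    return worst_case_peak - observed_peak


def dormant_capacity(worst_case_peak):
    # DC: capacity that stays unused even if all shippers peak simultaneously.
    return 1.0 - worst_case_peak


# 2007 figures from the text, written as fractions.
peak_observed = 0.823
peak_worst_case = 0.902

ui_2007 = unsimultaneity_index(peak_worst_case, peak_observed)  # about 0.079
dc_2007 = dormant_capacity(peak_worst_case)                     # about 0.098

# The three parts add up to the full capacity.
total = peak_observed + ui_2007 + dc_2007
```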


unsimul-GTS insight in the relative causes of low utilization of the grid. It must be stated that both unsimultaneity and dormant capacity are characteristics of a well functioning gas market. Low utilization of gas transport capacity therefore does not imply overall inefficiency. The next section will further explore this issue for the years before 2007 and relate these numbers to market concentration.

2.3 Earlier years and market concentration

The previous section discussed the construction of different Load Duration Curves and the resulting Unsimultaneity Index. Replicating these calculations for earlier years is the next step in this analysis.

Utilization rates and their corresponding LDC’s for 2003 up to 2006 are presented in Appendix 6.1.2. Based on these LDC’s we again calculate Unsimultaneity Indices for these years. These are presented in Table 1.

It has been suggested that there might be a correlation between unsimultaneity and the market concentration. The argument is that in a monopolistic market the market is more predictable for the shippers so less overbooking is needed. On the other hand, in a competitive market shippers need a lot of flexibility because of competition and unpredictability. This possibility will be explored by performing this analysis for a number of years and determining the relation between the number of shippers and the inefficiency in the total maximum utilization rate of the cluster over these years.

The hypothesis that unsimultaneity would increase with the increase of the number of shippers is made more specific below:

There is a negative correlation between the Unsimultaneity Index and the Herfindahl-Hirschman Index

The construction of the Unsimultaneity Index has been discussed in Section 2.2.2.


Table 1: Summary of data for unsimultaneity and market concentration

Year    2003    2004    2005    2006    2007
LS1     81%     78%     88%     82%     82%
LW1     92%     83%     98%     94%     90%
DC      8%      17%     2%      6%      10%
UI      11%     5%      10%     13%     8%
HHI     3227    3405    3073    2083    2063

10000, that is for each year we calculate

$$HHI = 10000 \cdot \sum_{i=1}^{N} (S^a_i)^2 \tag{13}$$

In this instance we choose to use average usage rights instead of peak usage rights because this gives a better approximation of the market shares of the different shippers over the year. With only 5 years of relevant data available this gives us only 5 datapoints. This makes a proper statistical analysis meaningless so we confine ourselves to an exploratory analysis.
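As a point of reference for the HHI values in Table 1, the index can be computed with a one-line function. This is a minimal sketch: the function name `hhi` and the benchmark share vectors are hypothetical and are not the thesis data.

```python
def hhi(average_shares):
    """Herfindahl-Hirschman Index on a 0-10000 scale.

    average_shares: market shares as fractions that sum to one.
    """
    return 10000 * sum(s ** 2 for s in average_shares)


# Hypothetical benchmarks: a monopoly and five equally sized shippers.
hhi_monopoly = hhi([1.0])
hhi_five_equal = hhi([0.2] * 5)
```

With N equally sized shippers the index equals 10000/N, so the values of roughly 2070 found for 2006 and 2007 are comparable to a market of about five equally sized shippers.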


Figure 10: HHI p UI


3 Time Series Factor Analysis of Shipper Behaviour

3.1 Introduction

The previous section discussed the concept of Load Duration Curves and the Unsimultaneity Index. The Unsimultaneity Index gives us some insight into the consequences of shipper behaviour for the cluster utilization rate. This section will focus directly on shipper behaviour by building a model for the individual shipper data. Section 3.2 will discuss how we came up with our selection of the appropriate model. Section 3.3 discusses the model itself. The results are presented in Section 3.4 with the conclusion being presented in Section 3.5.

3.2 Model selection

To gain more insight into the behaviour of shippers, so as to explain their maximum utilization rate, I decided to search for commonalities between the shippers. One would instantly think of calculating correlations between shippers. However, since we are dealing with time series this might result in problems like spurious regression. Estimating a VAR model also comes to mind. VAR models are treated extensively in Lutkepohl (1991).

Some specifications of VAR models were estimated, both for hourly data and weekly data. However, the reliability of the models was questionable since R² values turned out to be low and most parameter estimates were not significant. For hourly data this might have been caused by low volatility. Differenced data then contain a lot of zero values. Differencing was needed because of the presence of unit roots.

The weekly data suffered from the same flaws. Although some higher R² values were observed, the parameter estimates were still insignificant because of high standard errors on these estimates. When looking at the impulse responses, most of them are flat, which indicates that shippers are not responding to each other.

A more promising path would be to look for exogenous variables that shippers might respond to. One could think of temperature, prices, production failures and so on. However, it is found that exogenous variables that explain shipper behaviour are not identified that easily. Temperatures mainly change the demand for G-gas, not so much the import of H-gas. A production failure variable would be shipper dependent and therefore not a relevant variable for all shippers. It is determined that there is a lack of knowledge about which variables influence the utilization rate.


identify independent variables such that, when used in a regression for the dependent variables, those independent variables can explain the observed data as accurately as possible. An overview of the theory and applications of factor analysis is found in Wansbeek and Meijer (2000).

I choose to estimate a factor model because I want to investigate what the underlying reasons are for the shippers to behave the way they do. There are no obvious exogenous variables that explain their behaviour so a factor model might expose some hidden factors.

With this modelling technique we can answer a subquestion of the original research question.

How much of the observed shipper behaviour can be explained by identifying common factors for these shippers?

When applying the standard Factor Analysis model to the data we notice one major deviation from the necessary assumptions. The data we are dealing with are time series.

Factor Analysis theory assumes observations are independent and identically distributed. Time series data are typically autocorrelated and therefore not independent. Dynamic Factor Analysis (DFA) was developed to address these differences. However, a drawback of DFA is that a model of the factors must be specified in advance. As a result, parameter estimates and predictions will depend heavily on the specified dynamic factor model. Since we do not want to impose any restrictions on our model without having some certainty about the validity of these restrictions we choose not to apply DFA. A good treatment of the recent developments around Dynamic Factor Models is given in Breitung and Eickmeijer (2005).

A model that does not have this drawback is the Time Series Factor Analysis model developed by Gilbert and Meijer (2005). This model does not make assumptions about the specification of the estimated factors, which reduces the chance of biased results because of questionable assumptions. Originally the model was used for time series data on monetary aggregates. These financial data are more volatile than the data we are dealing with here. Still we have confidence in the usefulness of this approach, which is reinforced by the following statement:

“The estimation methodology should be applicable to a much larger range of problems than the application considered here, and indications are that it can work very well.” (Gilbert and Meijer, 2005)


3.3 Model description

The $k$ unobserved processes of interest (the factors) for a sample of $T$ time periods will be indicated by $\xi_{it}$, $t = 1, \ldots, T$, $i = 1, \ldots, k$. The $M$ observed processes, the shipper utilization rates, will be denoted, as before, by $y_{it}$, $t = 1, \ldots, T$, $i = 1, \ldots, M$. The factors and indicators for period $t$ are collected in the vectors $\xi_t$ and $y_t$. It is assumed there is a measurement model relating the indicators to the factors given by

$$y_t = \alpha + B \xi_t + \varepsilon_t \tag{14}$$

where $\alpha$ is an $M$-vector of intercept parameters, $B$ is an $M \times k$ matrix parameter of factor loadings, and $\varepsilon_t$ is a random $M$-vector of measurement errors. $B$ can be estimated using conventional FA estimators (using the sample covariance $\Psi$). In a more general case the intercept parameter could also be time variant, denoted by $\alpha_t$.

Gilbert and Meijer discuss five conditions that are to be satisfied for the parameter estimates to be consistent. It follows that data does not need to be stationary as a weaker form of boundedness suffices. Nevertheless, estimating a differenced factor model is a possibility. The estimated factor loadings matrix in the differenced model is similar to the estimated factor loadings matrix in the undifferenced model, only the factors differ. I have tried both differenced and undifferenced data and based on the prediction results I have chosen to estimate the models for differenced data.

The factor score predictors that are most frequently used are the Bartlett predictor and the regression predictor. The regression predictor requires knowledge about the mean and covariance of ξ_t whereas the Bartlett predictor does not. Therefore we decide to use the latter. Assuming α = 0 the Bartlett predictor for predicting factor scores ξ_t is

$$\hat{\xi}^B_t = (B' \Psi_t^{-1} B)^{-1} B' \Psi_t^{-1} y_t \tag{15}$$

and the predicted values of the indicators can be computed by:

$$\hat{y}_t = \hat{B} \hat{\xi}_t \tag{16}$$

Adding an error term $\varepsilon_t$ would open up the possibility for a simulation study by generating a collection of error term series and calculating confidence intervals. A more detailed discussion of the model can be found in Gilbert and Meijer (2005).
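To make the Bartlett predictor concrete, the sketch below evaluates it for the simplest case of a single factor (k = 1) and a diagonal error covariance, in which case the matrix expression reduces to a ratio of two weighted sums. This is an illustration only and not the TSFA package code; the diagonal covariance is an assumption of the sketch, and the function names and numbers are hypothetical.

```python
def bartlett_score_one_factor(loadings, error_variances, y):
    """Bartlett factor score for k = 1, assuming alpha = 0 and a diagonal Psi.

    loadings: the M factor loadings (the single column of B).
    error_variances: the M diagonal elements of Psi.
    y: the M indicator values observed in one period.
    """
    # B' Psi^{-1} B reduces to a weighted sum of squared loadings.
    weight = sum(b * b / psi for b, psi in zip(loadings, error_variances))
    # B' Psi^{-1} y reduces to a weighted sum of loading times indicator.
    projection = sum(b * v / psi for b, psi, v in zip(loadings, error_variances, y))
    return projection / weight


def predict_indicators(loadings, score):
    # Predicted indicators: loadings times the predicted factor score.
    return [b * score for b in loadings]


# Hypothetical example: indicators generated without error from a factor value of 2.
loadings = [1.0, 0.5, 2.0]
error_variances = [0.1, 0.2, 0.4]
y_observed = [2.0, 1.0, 4.0]

score = bartlett_score_one_factor(loadings, error_variances, y_observed)
y_predicted = predict_indicators(loadings, score)
```

Because the toy indicators contain no measurement error, the predictor recovers the factor value and the predicted indicators coincide with the observed ones. With real data, indicators with a large error variance receive little weight in the score.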


If we can identify a number of factors it would be interesting to see if we can explain these factors by exogenous variables. If these factors have some pattern that might be explained by other variables this is worth further investigation. This can be tested by running some additional regressions with these factors as dependent variables.

By analyzing factor loadings one could cluster the shippers by their dominant factors. If a clear clustering emerges from the analysis we could group shippers according to this clustering. Identifying groups of shippers that exhibit the same behavioural pattern would be useful in predicting changes when new shippers of some type enter the market. With this clustering we can explain how a change in the number or type of shipper changes overall utilization rates. If the model accuracy is convincing, we could make short term and long term predictions based on the model. Short term predictions can be useful in offering extra capacity to the market in the short term. Long term predictions can help decide whether it is necessary to invest in extra capacity.

Computations are done with the statistical program R. The code was written by Gilbert and Meijer, who made it publicly available in the package TSFA. The code we have used is presented in Appendix 6.3.1. For the availability of R and the packages we refer to the bibliography.

3.4 Results

3.4.1 Daily data over 2007 for the 8 biggest shippers

For the construction of a model two choices have to be made. The first choice is which data input to use for the analysis. The choice for hourly, daily or monthly data will have a significant effect on the interpretation of the model results. Monthly data will have long term interpretational value, whereas hourly data will be of more value for making short term interpretations. The methodology has been applied to a range of different datasets. Five of those are:

• Hourly data over 2007, allocations
• Hourly data over 2007, utilization rate
• Daily data over 2007, utilization rate
• Hourly data over January 2007, utilization rate
• Monthly data over 2003 up till 2007, utilization rate


The second choice that has to be made is the number of factors to include in the model. To decide on this we can use either a χ² test or the Akaike Information Criterion. In most cases the two criteria disagree with each other.

After performing the TSFA on all possible datasets we found some recurring results. No estimations identify a model with a low number of factors and high explanatory value. To show this we present one model that illustrates this effect.

We estimate a model with daily data over 2007 for the 8 biggest shippers. The model is estimated for an arbitrary number of factors. For every different number of factors statistics are produced, i.e. the Akaike Information Criterion and p-values based on the χ² test. As those statistics mostly differed we decided to choose the model based on the AIC. The AIC chooses 3 factors. This is confirmed by the scree plot (Figure 11) which shows three eigenvalues bigger than 1. This is only a rule of thumb but since it confirms the AIC we are reassured by it.

Figure 11: Scree plot of the eigenvalues that correspond to the extracted factors.


In Figure 12 we present a 2-dimensional plot of the first two factor loadings. The factors and factor loadings are all standardized to shipper 1. This is the plot where we would have expected a clustering to occur. Any group of loadings close to each other would suggest a similar behavioural pattern. We do not find any obvious clusterings in this plot.

Figure 12: 2 dimensional plot of factor loadings from a three factor model on daily data in 2007. Every dot represents one shipper.

In advance, we hoped to be able to assign the shippers to different groups with the help of this particular graph. Shippers that react similarly to the same factors have common behavioural patterns. However, after closely inspecting Figure 12 we see no obvious groupings occur. This is a somewhat disappointing result.


Then we have a look at the predictions for shipper X. In Figure 14 we see that the predictions and the actual values are not even remotely following a similar pattern. Apparently the model has assigned more importance to resembling shipper Y than resembling shipper X.


Figure 14: Actual and predicted utilization rates of shipper X over 2007, based on the 3-factor model.

This result is disturbing as it shows that, because there are no common patterns in shipper behaviour, the model estimates factors close or equal to some shipper’s utilization rate. This is the result that we found across all different analyses. This indicates that our model is seriously flawed.

A low dimensional factor model with high explanatory value is not found in either of the modeled situations. Not in the long term (monthly data from 2003 to 2007) nor in the short term (hourly data for January 2007). Adopting a model with many factors is often suggested by the AIC but this is not what we are looking for. These models extract factors that closely resemble some important shippers, not external factors.

Since this estimation method does not provide us with a satisfactory model we do not continue the analysis using this model. Clustering or estimating dynamic models for the extracted factors of these flawed models will not be helpful.

3.5 Conclusion


4 Monte Carlo Simulation and the Extreme Value Distribution

4.1 Introduction

The previous section discussed a Time Series Factor Analysis on the shipper utilization rate data over 2007. This TSFA has not produced results that indicate any meaningful correlation with external factors. Inspecting the utilization rate behaviour of the individual shippers confirms the view of shippers acting independently. Before jumping to conclusions, however, we would like to confirm this view by posing a new hypothesis.

Even though some correlation between shippers is observed, these correlations do not affect the maximum total utilization rate. Stated differently, for the purpose of evaluating the maximum total utilization rate we can assume that shippers behave totally independently of each other.

The next step would be to determine a way to test this hypothesis.

4.2 Model selection

To determine an appropriate way of testing this hypothesis we consider the assumption of ’independence’. This implies that whatever the other shippers are doing does not affect the behaviour of one particular shipper. Since the shipper data are time series, regressing two shippers on each other would result in spurious regression, which does not tell us anything about independence. Differencing does not give satisfactory results either, as most data series are relatively steady on an hourly basis.

However, we are actually interested in the maximum utilization rate, not so much in the behaviour of the rest of the data. Two different methods could be applied to this situation. One is the Peak over Threshold method. This method selects all maximum values of a time series that exceed a certain threshold. The advantage of this method is that by selecting an appropriate threshold one can calibrate the number of datapoints that are available. A lower threshold value results in more data exceeding the threshold value, which generates more datapoints. The disadvantage of this method is that it assumes a similar data generating process over the whole period. Since our data series contains 5 years in which the gas transport market has undergone some significant changes this does not apply to our data set.


4.3 Model description

The posed hypothesis suggests that all shippers act independently of each other. To test this hypothesis we need to simulate this situation and test whether the simulation and reality correspond.

To simulate independence we impose independence on the shipper utilization rates. We take the original dataset with individual shipper utilization rates weighted by peak market share. We assume the utilization rate data as given. Then for every shipper we draw a different random hour in the year and take its corresponding shipper peak weighted utilization rate. Taking the summation over the different shippers gives a new total utilization rate. We do this 24*365 times, producing a simulated year of ’virtual’ utilization rates V_j.

One could make an LDC of $V_j$ or only look at its maximum value $V_j^{\max}$. As the maximum is what interests us we only consider the maximum. Up till here we have only done one simulation of this situation. This simulation can be done several times, drawing $V_j^{\max}$ from these simulations. This results in a sample of, say, $m$ simulated maxima. This procedure is called a Monte Carlo Simulation. A compact treatment is given in Johnston and Dinardo (1997).

These $m$ simulated maxima can be represented graphically in a histogram, which is done in Figure 15. So what is the interpretation of this histogram? If all shippers would have realized the observed individual shipper utilization rates over the year, but independent of each other - that is, the timing at which these utilization rates occur is uncorrelated - then the actual maximum utilization rate would likely fall somewhere within the confidence bounds of the histogram. In addition, based on a few assumptions, this histogram should resemble an Extreme Value Distribution, which will be further discussed in Section 4.4.

This method can be applied to different datasets in different years and different time periods. Together, these analyses should be able to tell us whether our hypothesis holds true or that assessing correlations between shippers is a useful way to make inferences about the height of the attained peak utilization rate.
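The simulation procedure described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the thesis: hours are drawn with replacement and independently for every shipper, the function names are hypothetical, and the toy data (3 shippers, 48 hours) are randomly generated stand-ins for the peak weighted shipper utilization rates.

```python
import random


def simulate_maximum(weighted_rates, rng):
    """Maximum of one simulated period of 'virtual' utilization rates.

    weighted_rates[i][h]: peak-share-weighted utilization rate of shipper i in hour h.
    """
    hours = len(weighted_rates[0])
    best = float("-inf")
    for _ in range(hours):
        # Every shipper contributes the rate of its own randomly drawn hour,
        # which imposes independence between the shippers.
        virtual = sum(series[rng.randrange(hours)] for series in weighted_rates)
        best = max(best, virtual)
    return best


def monte_carlo_maxima(weighted_rates, m, seed=0):
    # Repeat the simulation m times and collect the m simulated maxima.
    rng = random.Random(seed)
    return [simulate_maximum(weighted_rates, rng) for _ in range(m)]


# Hypothetical toy data.
toy_rng = random.Random(1)
toy_rates = [[0.3 * toy_rng.random() for _ in range(48)] for _ in range(3)]

maxima = monte_carlo_maxima(toy_rates, m=20)
worst_case_peak = sum(max(series) for series in toy_rates)
```

By construction no simulated maximum can exceed the worst case peak, which is the sum of the individual shipper maxima.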


4.4 Fitting a Generalized Extreme Value Distribution

We have drawn a histogram of this collection of maxima, see Figure 15. The data are the $m$ maxima of identically distributed independent variables so it can be proven that the distribution of this sample goes to the Generalized Extreme Value distribution in the limit. This distribution is characterized by three parameters: location parameter $\mu$, scale parameter $\sigma$ and shape parameter $\xi$. The cdf is given by:

$$G(x) = \exp\left[ -\left( 1 + \xi \, \frac{x - \mu}{\sigma} \right)^{-1/\xi} \right] \tag{17}$$

Although theoretically the utilization rate is bounded, in practice the maximum never gets close to these bounds so we can assume this distribution is without bounds. This implies the use of the Gumbel distribution which is a special case of the Generalized Extreme Value distribution. The Gumbel distribution is the Generalized Extreme Value distribution with ξ → 0. The result is a distribution where the parameter ξ drops out so this distribution is characterized by two parameters. Location parameter µ and scale parameter σ. The cdf is given by

$$G(x) = \exp\left[ -\exp\left( -\frac{x - \mu}{\sigma} \right) \right] \tag{18}$$

These parameters can be estimated using the method of moments, the maximum likelihood estimator or least squares. We have used the method of moments, which states that

$$E(X) = \sigma \lambda + \mu \tag{19}$$

$$Var(X) = \sigma^2 \pi^2 / 6 \tag{20}$$

where $\lambda = 0.577215$ is Euler’s constant. This is a system of two equations in two unknowns which can be easily solved:

$$\hat{\mu} = \bar{X} - \lambda \hat{\sigma} \tag{21}$$

$$\hat{\sigma} = \frac{\sqrt{6} \, s_e}{\pi} \tag{22}$$

with

$$s_e = \sqrt{\frac{1}{m} \sum_{j=1}^{m} (x_j - \bar{x})^2} \tag{23}$$
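Solving the moment conditions (19) and (20) for the two parameters takes only a few lines. The sketch below assumes that the standard deviation is computed with divisor m; the function name and the three-point sample are hypothetical and only serve to check the algebra.

```python
import math

EULER_CONSTANT = 0.577215  # value of Euler's constant as quoted in the text


def fit_gumbel_moments(sample):
    """Method of moments estimates (mu, sigma) of a Gumbel distribution."""
    m = len(sample)
    mean = sum(sample) / m
    # Standard deviation of the sample with divisor m (an assumption of this sketch).
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / m)
    # Var(X) = sigma^2 * pi^2 / 6 gives the scale parameter.
    sigma = math.sqrt(6) * sd / math.pi
    # E(X) = sigma * lambda + mu gives the location parameter.
    mu = mean - EULER_CONSTANT * sigma
    return mu, sigma


# Hypothetical check: for the sample 1, 2, 3 the standard deviation is
# sqrt(2/3), so that sigma equals 2 / pi.
mu_hat, sigma_hat = fit_gumbel_moments([1.0, 2.0, 3.0])
```

By construction the fitted distribution reproduces the mean and variance of the sample.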


the simulation with the fitted distribution and produce p-values to indicate the goodness of fit. We decide to use the Kolmogorov-Smirnov test for goodness of fit. A description of this test can be found in any graduate level statistics textbook. This test is based on measuring the maximum distance between the empirical distribution function and the Gumbel distribution, where the parameters of the Gumbel distribution are taken from the MM estimates. The empirical distribution function is defined as

$$F_m(x) = \frac{1}{m} \sum_{j=1}^{m} I_{V_j \le x} \tag{24}$$

with $I_{V_j \le x}$ the indicator function. Then the Kolmogorov-Smirnov statistic is:

$$D_m = \sup_x |F_m(x) - G(x)| \tag{25}$$

The null hypothesis that the data follow a Gumbel distribution is rejected if $D_m$ exceeds a predetermined critical value. These critical values can be obtained from a table. p-values can be obtained by interpolating between these critical values. The p-values for the simulations of different time periods are reported in Table 3 in the Appendix.
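The statistic can be computed exactly by evaluating the fitted cdf at the sorted sample points, because the empirical distribution function only changes at those points. The following Python sketch shows this for the Gumbel case; the function names and the one-point check are hypothetical, and critical values or p-values are not computed here.

```python
import math


def gumbel_cdf(x, mu, sigma):
    # Gumbel cdf with location mu and scale sigma.
    return math.exp(-math.exp(-(x - mu) / sigma))


def ks_statistic(sample, cdf):
    """Largest distance between the empirical distribution function and cdf."""
    xs = sorted(sample)
    m = len(xs)
    distance = 0.0
    for j, x in enumerate(xs, start=1):
        g = cdf(x)
        # At x the empirical distribution function jumps from (j - 1)/m to j/m,
        # so both sides of the jump have to be compared with the fitted cdf.
        distance = max(distance, j / m - g, g - (j - 1) / m)
    return distance


# Hypothetical check: one observation at the median of the fitted distribution.
mu, sigma = 0.82, 0.01
median = mu - sigma * math.log(math.log(2))
d_single = ks_statistic([median], lambda x: gumbel_cdf(x, mu, sigma))
```

For a single observation at the median the fitted cdf equals 0.5 while the empirical distribution function jumps from 0 to 1, so the statistic equals 0.5.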

Because the hypothesis of the data following a Gumbel distribution is rejected a few times, estimates for the Generalized Extreme Value distribution have also been obtained for all simulations. We have used a Maximum Likelihood estimator for the determination of the parameter estimates. The same Kolmogorov-Smirnov test applies in this situation and has been performed.

Now that we have discussed the methodology we are ready to discuss some results that were obtained by performing the Monte Carlo Simulations.

4.5 Results

4.5.1 The 2007 model

We have taken hourly data over 2007 for all 20 active shippers. Performing 500 simulations and evaluating the histogram in Figure 15 shows that the distribution of the simulated maximum utilization rate seems to follow an extreme value distribution. It can be seen that the distribution of the simulated utilization rate maxima is found between 0.80 and 0.87. The observed maximum was 82.3% which falls nicely within the graph. This is confirmed by the


shipper behaviour.

Since our hypothesis cannot be rejected for 2007 we can state that even if correlations between shippers are present, these do not affect the maximum total utilization rate. This is an important conclusion as this would rule out any speculation about shippers peaking at the same time because of external factors, or shippers peaking unsimultaneously, acting as each other’s substitute because of a steady demand and supply situation.

The Kolmogorov-Smirnov test for 2007 data does not lead to rejection of the Gumbel distribution, with D = 0.044 and p = 0.279. Its parameters are estimated as $\hat{\mu} = 0.822$ and $\hat{\sigma} = 0.008$. This results in a 95% confidence interval from 0.812 to 0.852.

This 95% confidence interval shows a spread of 4% in the simulated maximum utilization rate. This implies that if our assumptions about shipper independence are true and the individual shipper utilization rates are known, there is an uncertainty of about 4% in a prediction for maximum utilization rate.

In the next section we will present this analysis for the years before 2007.


4.5.2 Similar models for earlier years

For earlier years we have performed a similar analysis as for 2007. The figures of the simulations are presented in Appendix 6.2.2. For a compact summary of the results that are presented in the Appendix data tables and graphs we have constructed Figure 16. This figure shows the results of the simulations for the years 2003 to 2007. The boxplots display the mean of the simulation up to one standard deviation in the box and up to two standard deviations by lines. In addition to that, the actual maximum utilization rate is depicted by the green dot and the worst case maximum utilization rate by the red dot.

Figure 16: Boxplot


of new capacity was offered to the market which changed the structure of the transport market. When inspecting Figure 32 we clearly see a deviation from the pattern we observe in the other histograms. The histograms for 2003, 2006 and 2007 show a clear extreme value distribution pattern emerging in the histogram. 2004 only slightly deviates from the extreme value distribution though not for any identified reason. However, 2005 shows a double peaked histogram which results in a high value for the standard deviation in the boxplot. To correct for this effect we decided to do a separate analysis of both periods in 2005 in the next section. The reason for this double peak is found in a significant increase in available cluster capacity during 2005.

This observation is confirmed by the Kolmogorov-Smirnov test for the Gumbel distribution in 2004 and 2005. Both years reject the hypothesis of the simulation being Gumbel distributed with a zero p-value. For the other years Kolmogorov-Smirnov does not reject the Gumbel distribution.

Having one comprehensive plot of the simulations over the 5 different years should give us insight in a possibly emerging pattern. However, no clear pattern can be observed from these simulations. We are tempted to conclude that every year has its own characteristics and therefore a different result every year.

However, we can make one interesting observation. In the years where the simulation predicts high maximum utilization rates (2003, 2005, 2006) the actual maximum utilization rate is lower than its prediction. In the years where the simulation predicts lower maximum utilization rates (2004, 2007) the actual maximum utilization rates fall within the range of predicted values. Since this observation is only based on 5 years any conclusion should be taken with care. Still, if we are to draw a conclusion from this analysis, it would be that during times of a high maximum utilization rate the market for gas might be tight and negative correlation between shippers ensues. During the years of a somewhat lower maximum utilization rate, competition for gas might be less fierce and the negative correlation vanishes.

For comparison we have selected a number of smaller time periods and reproduce the same analysis for these time periods.

4.5.3 Models for smaller time periods

Models were fitted for

• March 2007
• September 2007

Figures for the Monte Carlo simulation (like Figure 15) are shown in Appendix 6.2.2. It is expected that over smaller timeperiods the spread in the maximum utilization rate simulation would decrease. This is also what is observed though not very convincing. A nice comparison can be made between September 2007 and the year 2007 since these simulations are based on the same peak market shares. The only difference is that the September simulation only uses a subset of the data.

For the estimated scale parameter of the Gumbel distribution we find $\hat{\sigma}_{\text{September}} = 0,008$ and $\hat{\sigma}_{2007} = 0,009$.

There is hardly any difference between the estimated scale parameters of the two simulations. The estimates of $\sigma$ for January and March 2007 are both 0,007, so they are also close to the yearly estimate.

The odd result that was obtained for 2005 in the previous section disappears when the year is divided into two parts. Although the Extreme Value Distribution for these two periods can still be rejected at some significance levels, the double-peaked shape that was observed for the year 2005 (Figure 32) is not observed for its separate periods. We also see a clear difference between the two periods. For early 2005 we find a high maximum utilization rate with a very small variance. For autumn 2005 we find a much lower maximum utilization rate but with a large variance. This is intuitive: an increase in available capacity in autumn 2005 lowers the maximum utilization rate for that period.

It is interesting to see that the relation between high maximum utilization rates and negative correlation that we found in the previous section still holds. Among these 5 time periods, the period with the highest maximum utilization rate is also the one where the actual maximum utilization rate is lower than the simulated maximum utilization rate. This supports our view that the higher the utilization rates, the more downward pressure on the actual maximum utilization rate is observed.

The next section links the Monte Carlo analysis to the analyses discussed in Section 2.

4.6 Relation to the Unsimultaneity Index

We call the difference between the worst-case peak utilization rate and the simulated mean peak utilization rate 'Randomness Unsimultaneity', RU:

$$RU = LW_1^p - V_{\max} \tag{26}$$

and the difference between the actual peak utilization rate and the simulated mean peak utilization rate 'Correlation Unsimultaneity', CU:

$$CU = LS_1^p - V_{\max} \tag{27}$$

so that RU − CU = UI again. These indices are presented in Table 2.

Table 2: RU and CU

Year    2003   2004   2005   2006   2007
LS1      81%    78%    88%    82%    82%
LW1      92%    83%    98%    94%    90%
Vmax     85%    78%    93%    86%    83%
DC        8%    17%     2%     6%    10%
UI       11%     5%    10%    13%     8%
RU        7%     4%     5%     8%     8%
CU       -4%    -1%    -5%    -4%     0%

This table clearly shows the effect we described in the previous section. The years 2004 and 2007, which have the lower simulated maximum utilization rates, show a Correlation Unsimultaneity close to 0%. The years with the higher simulated maximum utilization rates have a Correlation Unsimultaneity of −4% to −5%, that is, they show a negative correlation.
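As a check on the definitions, the indices can be recomputed from the rounded LS1, LW1 and Vmax rows of Table 2. The sketch below does this in whole percentage points. The expression used for UI (LW1 minus LS1) follows from subtracting the definition of CU from that of RU. Some recomputed values differ by one percentage point from the printed rows, presumably because the printed indices were derived from unrounded inputs. That explanation is an assumption.

```python
years = [2003, 2004, 2005, 2006, 2007]
ls1 = [81, 78, 88, 82, 82]    # actual peak utilization rate (%)
lw1 = [92, 83, 98, 94, 90]    # worst-case peak utilization rate (%)
vmax = [85, 78, 93, 86, 83]   # simulated mean peak utilization rate (%)

ru = [w - v for w, v in zip(lw1, vmax)]   # Randomness Unsimultaneity
cu = [s - v for s, v in zip(ls1, vmax)]   # Correlation Unsimultaneity
ui = [w - s for w, s in zip(lw1, ls1)]    # RU - CU = LW1 - LS1

for year, r, c, u in zip(years, ru, cu, ui):
    print(f"{year}: RU={r:>3}%  CU={c:>3}%  UI={u:>3}%  RU-CU==UI: {r - c == u}")
```

The identity RU − CU = UI holds exactly for the recomputed values, since Vmax cancels out.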

4.7 Conclusion

rate. For these years we must conclude that shippers exhibit a negative correlation that lowers the maximum utilization rate compared to a situation with independent shippers. This is an interesting result in itself, but we can say more. The years 2004 and 2007 are the years in which the simulated mean of the maximum utilization rate is lower than in the other years. This tempts us to conclude that a higher simulated maximum utilization rate puts downward pressure on the observed maximum utilization rate by inducing negative correlation between shippers. This negative correlation might be caused by increased competition for gas in a market where the utilization rate is already high.

5 Conclusion

5.1 Summary and results

In this study we have analysed the utilization rate in the Emden Oude Statenzijl cluster. The analysis covers the period 2003 up to 2007. Over this period we have defined the hourly utilization rate as firm allocations over firm usage rights.
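The definition of the hourly utilization rate can be written down in a few lines. The sketch below is illustrative only. The function name and the hourly figures are hypothetical and are not taken from the data used in this study.

```python
def hourly_utilization(allocations, usage_rights):
    """Hourly utilization rate: firm allocations divided by firm usage rights."""
    return [a / r for a, r in zip(allocations, usage_rights)]

# Hypothetical hourly values in arbitrary capacity units.
firm_allocations = [600, 750, 900, 450]
firm_usage_rights = [1000, 1000, 1200, 1200]

rates = hourly_utilization(firm_allocations, firm_usage_rights)
max_rate = max(rates)   # maximum utilization rate over the period
print(rates, max_rate)
```

The maximum utilization rate over a period is then simply the largest of the hourly rates in that period.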

A number of hypotheses have been tested in this analysis. First, a negative correlation between the Herfindahl-Hirschman Index and the Unsimultaneity Index was hypothesized. This hypothesis was based on the expectation that more shippers and more competition would increase unsimultaneity in shipper peak utilization. The hypothesis has been explored but no obvious relation was found. A relation between the Herfindahl-Hirschman Index and the maximum utilization rate was not found either.
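For reference, the concentration index in its standard textbook form is the sum of squared market shares. The sketch below uses that standard definition with hypothetical shares. The exact share definition used in this study is given in the earlier sections and may differ in detail.

```python
def herfindahl_index(shares):
    """Sum of squared market shares, with shares expressed as fractions summing to 1."""
    return sum(s * s for s in shares)

# Hypothetical market shares of three shippers.
concentrated = herfindahl_index([0.5, 0.3, 0.2])
# Four equally sized shippers give the minimum value 1/n for n shippers.
equal_split = herfindahl_index([0.25] * 4)
print(concentrated, equal_split)
```

More shippers of similar size give a lower index, which is why a lower index was expected to go together with more unsimultaneity.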

Then we assumed that shipper utilization rates respond to common external factors. This was tested by a Time Series Factor Analysis since no obvious exogenous variables could be identified in advance. A TSFA would be able to expose these factors. However, dependence on common factors was not found. That is, the analysis does not determine realistic factors which all shippers respond to. We draw the conclusion that the shippers are not responding to common factors.

We further refine this statement by posing that, for the realization of the maximum utilization rate, it can be assumed that the shippers act independently. This is tested by performing a simulation based on this hypothesis. The result on the independence of shipper behaviour is not the same for all 5 years. For three years we find that correlation between shippers caused the observed maximum utilization rate to be lower than it would have been under independence. For the two other years we cannot reject the hypothesis of independence, which leads us to conclude that shipper behaviour was independent in these two years.

The result that we find can be summarized in the following quote:

In the years 2003, 2005 and 2006 behavioural patterns in utilization rate caused the maximum cluster utilization rate over a year to be lower than it would have been if no behavioural patterns had existed.
