
University of Amsterdam

Master's Thesis

Improving irradiance forecasting to make solar energy better

Author:

Dennis Knol

10791728

Daily Supervisor:

Fons de Leeuw

Examiner:

Dr. Valeria Krzhizhanovskaya

Assessor:

Dr. Jan Fokke Meirink

A thesis submitted in partial fulfilment of the requirements

for the degree of Master of Science in Computational Science

University of Amsterdam, November 13, 2020


Abstract

The primary goal of the project is to improve short-term forecasting of irradiance to accelerate the usability of solar energy. This work compares state-of-the-art implementations of the Weather Research and Forecast (WRF) model with Convolutional Gated Recurrent Unit (ConvGRU) models and studies under what circumstances the models perform best. Ultimately, the aim is to combine the strengths of various methods and provide the most accurate irradiance forecast for the next 6 hours over an area covering the Netherlands.

Two implementations of the WRF model were evaluated for case days in the summer of 2019. The models are configured explicitly for solar power applications and compared to two ConvGRU models. The ConvGRU model has been shown to outperform both extrapolation-based methods and an operational NWP system in the precipitation domain. For this work, the model is trained on sequences containing irradiance data from the Meteosat Second Generation Cloud Properties (MSG-CPP) dataset. Additionally, we design an extension to the model, enabling it to also exploit geographical data. An optical flow method serves as the benchmark.

The experimental results show that the ConvGRU model outperforms the other methods in all weather conditions and improves on the optical flow benchmark by 8.76% in terms of MAE. However, the ConvGRU prediction samples demonstrate that the model suffers from a blurry image problem, which causes cloud structures to smooth out over time. The optical flow model is better at representing cloud fields throughout the forecast. The WRF model struggles in the short range of the simulation and produces the least accurate irradiance forecasts. Further research is needed to find the optimal forecasting system that provides the best irradiance forecast on the 0-6 hour horizon.


Contents

1 Introduction
2 Literature Review
  2.1 Energy Market
    2.1.1 Energy System Design
    2.1.2 Imbalance Market
  2.2 Atmospheric Physics
    2.2.1 Solar Radiation and Atmospheric Effects
    2.2.2 Clouds and Aerosols
  2.3 Modelling Methods
    2.3.1 Numerical Weather Prediction
    2.3.2 The Weather Research and Forecasting model
    2.3.3 Optical Flow
    2.3.4 Deep Neural Networks
3 Methodology and Technical Description
  3.1 WRF Model Description
    3.1.1 Governing Equations
    3.1.2 Model Physics
    3.1.3 Variational Data Assimilation
    3.1.4 Implementation
  3.2 Optical Flow Baseline
  3.3 Neural Weather Models
    3.3.1 Encoder-Decoder Structure
    3.3.2 Convolutional Gated Recurrent Unit
    3.3.3 Model Extension
    3.3.4 Input Patch
    3.3.5 Implementation
  3.4 Data
    3.4.1 Global Forecasting System
    3.4.2 MSG-CPP
  3.5 Forecast Evaluation
    3.5.1 Metrics of Forecast Accuracy
4 Experimental Setup
  4.1 WRF Model
    4.1.1 Case Study Area and Model Domain
    4.1.2 Physics Parametrizations
  4.2 Optical Flow
  4.3 Deep Neural Network
    4.3.1 Experiments
  4.4 Evaluation
5 Results
  5.1 The WRF Model
    5.1.1 Weather-dependence
    5.1.2 Key Findings
  5.2 Neural Weather Models
    5.2.1 Weather-dependence
    5.2.2 Key Findings
  5.3 Optical Flow Baseline and Comparison
    5.3.1 Weather-dependence
    5.3.2 Key Findings
6 Conclusion & Recommendations
  6.1 Conclusion
  6.2 Recommendations
  6.3 Acknowledgements
Abbreviations
Bibliography
Appendices
A WRF Model
  A.1 Domain Setup
  A.2 Physics Settings in namelist.input
B Prediction Samples
  B.1 ConvGRU


1 Introduction

To help constrain global warming, the energy produced from fossil fuels is increasingly being replaced by renewable energy. This transition is accelerated by the fact that wind and solar power are now cheaper than energy from traditional resources [1, 2]. However, in terms of usability, wind and solar are not yet fully competitive due to their variable nature. More accurate weather forecasts are needed to increase the usability of solar energy.

This need stems from the workings of the energy market. The energy supply must continuously match demand, keeping the grid in balance. As conventional power plants are being replaced by renewable resources, balancing the grid is becoming increasingly complex [3, 4]. An unpredictable supply of renewable energy is replacing the supply from sources such as nuclear, coal and gas. It is the unpredictability of renewable sources such as wind and solar that makes it increasingly challenging to match supply and demand continuously.

One of the critical challenges is forecasting solar irradiance at the Earth's surface. When it comes to solar irradiance forecasting, clouds are the most important driver and notoriously challenging to predict [5]. Traditional methods to forecast clouds divide roughly into two classes: image-based extrapolation methods and Numerical Weather Prediction (NWP) based methods. Extrapolation-based methods perform well on a small temporal scale, but accuracy decreases markedly for increasing temporal scales as these methods fail to predict how clouds evolve. For larger temporal scales, the use of NWP models generally improves the forecast accuracy, particularly for clear sky conditions [6]. However, performance decreases for partly cloudy and cloudy conditions as NWP models do not explicitly resolve sub-grid cloud processes. Unresolved processes are parameterized and add a source of uncertainty to the model.

More recent studies propose to address these limitations from a machine learning perspective, taking advantage of the vast amount of weather data available. These studies formulate nowcasting as a spatiotemporal sequence forecasting problem and specifically focus on nowcasting precipitation, using Recurrent Neural Networks (RNN). Shi et al. [7] propose a novel Long Short Term Memory (LSTM) model with convolutional layers that captures spatial correlations and outperforms the optical flow based ROVER algorithm. A follow-up study proposes a convolutional Gated Recurrent Unit (ConvGRU) [8], an architecture which is less complicated and more efficient to train. Google's MetNet [9] is the first deep learning model to produce more accurate precipitation nowcasts than NWP.

Currently, state-of-the-art operational models use optical flow algorithms, but these ignore the growth and decay of clouds. In this project, we aim to address this problem by implementing two additional models. First, we implement the Weather Research and Forecasting (WRF) model, designed to meet the growing demand for specialised numerical forecast products for solar power applications. Second, we implement and experiment with a Convolutional Gated Recurrent Unit (ConvGRU) model trained on satellite imagery.


At present, methods used to make short-term solar predictions are not accurate enough, leading system operators to take a conservative approach to decide how much spinning reserve1 to schedule.

This project aims to improve the nowcast accuracy and maximise the exploitation of renewable energy. The overall research question for this project is: Can we improve the state-of-the-art nowcasting accuracy using Numerical Weather Prediction or Artificial Intelligence?

To answer this question, we first implement the WRF model and compare two specific configurations. After that, we train the ConvGRU model on satellite data to study whether this model can improve nowcasting accuracy and introduce an extension to the model, designed explicitly for additional geographical features. To conclude this study, we compare the WRF model and ConvGRU forecasts with optical flow output. The optical flow algorithm is developed and implemented by Dexter Energy Services and used as the benchmark.

We also study the strengths and weaknesses of each method. A method's performance depends on the weather conditions (e.g., clear sky vs cloudy), and different models perform well on different temporal scales. For this project, the goal is to find under what particular circumstances and spatiotemporal scales the models perform best.

The thesis is structured as follows. First, we explain the theory and discuss relevant literature. Subsequently, we present the methodology and explain the setup of the experiments. A discussion of the results and the conclusion follow.

1 At all times, the Transmission System Operator (TSO) keeps active power in reserve. When supply does not match demand, this reserve is activated to restore the balance of the grid.


2 Literature Review

In this chapter, we discuss relevant background information. To fundamentally understand this study’s relevance, we first briefly explain the organisation of the Dutch energy market. Subsequently, we discuss radiation, the physical properties of clouds and three specific modelling techniques. A discussion of the latest advances in solar nowcasting concludes this Chapter.

2.1 Energy Market

To start, we give an overview of the Dutch energy market and discuss challenges arising from the transition towards renewable energy. We discuss the functioning and organisation of the energy market and how energy trading ensures a balanced grid.

2.1.1 Energy System Design

The Dutch energy market is governed by a single Transmission System Operator (TSO) and is entirely open to competition. The Dutch TSO, TenneT, is responsible for keeping the grid in balance through a market mechanism and is the central hub for all transactions [10]. Maintaining a continuous balance in supply and demand is essential as storing power is costly and inefficient.

To ensure a balanced energy market, market players can buy and sell electricity. Trading can be done well in advance on the futures market, on the day-ahead market and on the intra-day market [11]. Most volume is traded on the day-ahead market, and an increasing amount of energy is traded on the intra-day markets as the predictability of renewable resources becomes higher closer to delivery. Exemplifying this shift is the significant increase in intra-day trades from 2018 to 2019 in the Netherlands: the volume of traded energy on the intra-day market increased by 57% during that year [12].

The shift from the day-ahead market towards the intra-day market is a direct consequence of the shift towards renewable sources. Market participants are changing portfolio optimisation strategies as the share of variable renewable energy increases. Increasingly, energy is traded intra-day because both supply- and demand-side forecasts are more accurate closer to delivery than on the day-ahead market. These more accurate forecasts are essential for market participants to buy or sell the right amount of energy, keeping the grid in balance.

2.1.2 Imbalance Market

In practice, however, it is not possible to precisely forecast both the demand and supply of electricity. As a consequence, supply and demand are often not entirely in balance, and the growth of energy generation from variable renewable sources amplifies the imbalance.


Ensuring the grid is in balance is the TSO’s primary responsibility. The TSO counteracts this imbalance by selling electricity to market participants with a shortage and buying from participants with excess electricity. This market behaviour counteracts the under- or oversupply to the grid [13].

The trading takes place on the imbalance market, where all parties causing an imbalance receive a financial penalty. This penalty is called the imbalance price and is determined by the size of the deviation from the power forecasts; in other words, the imbalance volume. The higher the imbalance volume, the higher the imbalance price [14]. The imbalance price incentivises market participants to counteract imbalance and stimulates parties to forecast the supply and demand for energy more accurately.

2.2 Atmospheric Physics

In this work, we focus on forecasts for solar power applications. We next describe atmospheric effects, cloud physical properties, and aerosols and how these affect various solar irradiance components. This section also forms the basis for understanding individual modelling concepts such as parameterization: a method for modelling small-scale processes, like evaporation and condensation.

2.2.1 Solar Radiation and Atmospheric Effects

The primary source of energy to the Earth is radiant energy from the Sun. The Planck-Einstein relation (1) describes the energy of the incoming electromagnetic radiation, where c is the speed of light, h Planck's constant and λ the wavelength. From the relation, it is clear that the shorter the wavelength, the higher the energy.

\[
E = \frac{hc}{\lambda} \tag{1}
\]

For this research, the focus is on the shortwave radiation reaching the Earth's surface, which on average, is only half of all incoming radiation [15]. The radiative transfer through the atmosphere is strongly affected by its local composition, and the amount of radiant energy reaching the surface varies widely due to absorption and scattering as a result of local variations in clouds, water vapour and pollution. The intensity of solar radiation reaching the surface is called irradiance and measured in W/m2. Latitude of location, the season of the year and time of the day are also important drivers determining irradiance.

The air in the atmosphere contains liquid and solid particles and is composed mostly of gases. Particles in the atmosphere reduce the Sun's energy by scattering the incoming radiation. Scattering is caused by the interaction between particles and radiation, redirecting the incoming beams from their original path. The extent to which the radiation scatters depends primarily on its wavelength and the number of particles in the atmosphere. Also, the distance travelled by the incoming radiation is an essential factor.


Scattering from particles whose sizes are smaller than the wavelength of the light, e.g. gas molecules, is called Rayleigh scattering. Rayleigh scattering is wavelength dependent, affects short wavelengths strongest and occurs primarily in the upper atmosphere, where smaller particles are more abundant. For particle sizes larger than the wavelength, such as dust particles and aerosols, the scattering is referred to as Mie scattering. Contrary to Rayleigh scattering, Mie scattering is not wavelength-dependent, meaning that various wavelengths are affected equally. Mie scattering mainly occurs in the lower atmosphere, where larger particles are more numerous than in the upper atmosphere.

The other atmospheric effects are reflection and absorption. Particles of liquid and frozen water cause reflection. Gas molecules, dust and aerosols cause absorption. Gases, mainly ozone, water vapour and carbon dioxide, have very high absorption but do not strongly affect overall power. The primary power reduction is caused by air molecules and dust absorbing solar energy. The absorbed radiation is converted into thermal energy, warming the particle, which causes the particles to emit radiation in the longwave band. A visualisation of the atmospheric effects on radiation processes is presented in Figure 1.

Figure 1: A representation of phenomena that influence shortwave and longwave radiation. The figure illustrates reflection, absorption and scattering. Figure from Meted.

As a consequence of the atmospheric effects, solar radiation breaks down into various components. In this study, we follow [6] and consider three solar irradiance components. The Direct Normal Irradiance (DNI) component is the irradiance measured at the surface of the Earth on a plane perpendicular to the Sun. The irradiance that has been scattered by particles in the atmosphere is called Diffuse Horizontal Irradiance (DHI), and the total irradiance of the Sun on a horizontal surface is referred to as Global Horizontal Irradiance (GHI). The latter is computed from DNI and DHI. Equation (2) describes this relation, where z is the zenith angle of the Sun:

\[
\mathrm{GHI} = \mathrm{DNI} \cdot \cos(z) + \mathrm{DHI} \tag{2}
\]
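As a small numerical illustration of relation (2), the snippet below computes GHI from assumed DNI, DHI and zenith angle values; the numbers are placeholders, not measurements from this study.

```python
# A minimal illustration of relation (2): GHI is the sum of the diffuse component
# and the direct component projected onto a horizontal surface. Values are made up.
import math

def global_horizontal_irradiance(dni: float, dhi: float, zenith_deg: float) -> float:
    """Compute GHI (W/m^2) from DNI, DHI and the solar zenith angle z."""
    z = math.radians(zenith_deg)
    return dni * math.cos(z) + dhi

print(global_horizontal_irradiance(dni=700.0, dhi=120.0, zenith_deg=40.0))  # ~656 W/m^2
```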


2.2.2 Clouds and Aerosols

On average, clouds cover about two-thirds of the globe [16] and have a profound impact on solar radiation reaching the Earth’s surface. Assessing the exact effect of clouds on radiation is challenging, as clouds are highly variable and influence the transfer of radiation through the atmosphere in numerous ways.

Clouds are composed of liquid water, ice or both and vary widely in optical properties and height. Different optical thicknesses and heights affect radiative transfer in different ways. Thick clouds, for example, both low and high, enhance the albedo and primarily reflect incoming shortwave radiation. The reflectivity of a cloud can range from 40 to 90%.

Clouds that are thin and high up in the atmosphere primarily transmit shortwave radiation and absorb longwave radiation emitted from the surface of the Earth. The clouds absorb or re-emit longwave radiation depending on the conditions. The role of clouds in radiative transfer is visualised in Figure 1, where clouds scatter, reflect and absorb incoming shortwave radiation and absorb and re-emit longwave radiation.

Fundamental to clouds is the presence of aerosols. Aerosols are tiny liquid or solid particles suspended in the air and can be from both natural and anthropogenic sources. The particles act as condensation nuclei and form surfaces for condensing water molecules. Without the presence of aerosols, clouds are unable to form.

The interaction of aerosols with cloud particles affects the transfer of solar radiation, as this interaction affects the life cycle of clouds. This is known as the aerosol indirect effect on radiation. Aerosols also directly affect radiation through interactions such as absorption and scattering, known as the aerosol direct effect. The direct effect of aerosols on the weather is better understood and quantified than the indirect effects. The latter are very complex and one of the primary sources of modelling uncertainty.

Figure 2: Schematic representation of the direct and indirect radiative effect of aerosols. Figure from [17]


2.3 Modelling Methods

On clear days, solar irradiance forecasts are relatively accurate, but skill decreases significantly when clouds are present. Clouds are notoriously difficult to model due to their high variability and the wide range of scales spanned by cloud processes. In numerical modelling, for example, not all processes can be resolved. Generally, small-scale interactions in cloud dynamics are accounted for in parameterizations. A parameterization is a simplified physical representation of microphysical processes in weather and climate models [16].

In this research, we implement various techniques to make nowcasts and assess the strengths and weaknesses of each model. A similar study was conducted by Miller et al. [18]. This study compares three individual methods and finds that a method's performance depends on both the spatial and temporal scale. On the temporal scale of 0-0.5 hours, persistence-based forecasts [19] are most accurate; persistence assumes that the shape and structure of a cloud do not change over time, and only the trajectory of the cloud is modelled.

For cloud forecasts for the next 0.5-3 hours, satellite-based methods are found to be most accurate. These methods rely on imagery from geostationary satellites and forecast the cloud's future trajectory by extracting similar points from consecutive cloud images. From these points, the position of a cloud in the image is used to generate a vector field. Such methods, however, are not well suited for modelling clouds that quickly dissipate or grow after initialisation, for example orographic clouds and clouds in coastal areas. For forecasts several hours ahead, e.g. 3-6 hours, predictions from NWP models are most accurate. NWP-based models describe the atmospheric dynamics and model the dissipation and formation of clouds, rather than only advecting the cloud fields the model is initialised on.

For this study, we implement three methods to nowcast solar irradiance. The first model we implement is an NWP model called the Weather Research and Forecast (WRF) model. The latter two techniques rely on imagery from geostationary satellites: optical flow and a deep learning model. In the following, we discuss the latest advances in the space of nowcasting for all three methods.

2.3.1 Numerical Weather Prediction

Numerical Weather Prediction (NWP) is based on fundamental laws of physics and consists of equations for the conservation of energy, mass and momentum. These models use current observations to forecast the future state of the atmosphere in discretised space and time, meaning that the atmosphere is divided into boxes, and time into timesteps. The size of these boxes is referred to as the model resolution, and all properties are assumed to be uniform in each box. For each box, the equations are integrated numerically forward in time from initial and boundary conditions. The equations are solved numerically because it is mathematically intractable to obtain solutions analytically, and the solutions are highly sensitive to the initial conditions as initial errors grow during the forecast [20, 5]. The sensitivity to the initial conditions is studied with an ensemble of deterministic simulations with slightly different initial conditions.
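To make the idea of integrating discretised equations forward in time more concrete, the toy sketch below advects a one-dimensional scalar field on a grid from an initial condition with a fixed lateral boundary. It is a didactic illustration only and bears no relation to the actual WRF solver.

```python
# A toy illustration of numerical integration in an NWP-style model: 1D advection of a
# scalar field on a discretised grid, stepped forward in time from an initial condition
# with a fixed inflow boundary. This is a didactic sketch, not the WRF dynamical solver.
import numpy as np

nx, dx, dt, u = 100, 1.0, 0.5, 1.0              # grid size, spacing, time step, wind speed
q = np.exp(-0.01 * (np.arange(nx) - 20) ** 2)   # initial condition: a smooth "cloud" blob

for _ in range(60):                              # integrate forward in time
    q_new = q.copy()
    # First-order upwind scheme: each box is updated from its upwind neighbour.
    q_new[1:] = q[1:] - u * dt / dx * (q[1:] - q[:-1])
    q_new[0] = 0.0                               # lateral boundary condition at the inflow edge
    q = q_new

print(int(q.argmax()))                           # the blob has been advected ~30 grid boxes downstream
```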


Figure 3: A visualisation of various physical processes essential to the weather. Not all of these processes are explicitly resolved in NWP models. Parameterizations represent unresolved processes, such as shallow convection and shortwave radiative transfer. Figure from [5].

The set of equations governing the NWP framework consists of equations describing atmospheric flow and temperature. These are the Euler equations describing momentum, the first law of thermodynamics for energy, balance equations for the mass of moisture and air, and the ideal gas law for the state. Based on these equations, NWP describes future changes in the atmosphere in terms of wind, pressure, density and temperature [5].

A direct consequence of the finite grid scale of NWP models is that not all processes can be resolved by the governing equations. Processes that occur on a smaller scale or that are too complex are represented by parameterizations. These parameterizations are simplified models that take what is known on the large scale and use it to describe the small-scale processes that the governing equations cannot resolve. Radiation, the formation of clouds and precipitation are examples of physical processes parameterized by NWP models. Figure 3 is a schematic representation of physical processes that are parameterized. The parameterized processes in the WRF model are discussed in Section 3.1.2. The representation of small-scale dynamics with parameterizations is one of the main sources of uncertainty in weather models. In particular, clouds and convective precipitation are very challenging to represent accurately.

The model used in this study is the Weather Research and Forecasting model (WRF), which is an open-source NWP system [21] capable of simulating at resolutions high enough to model weather phenomena. Generally, the model is used for simulations at resolutions of a few tens of kilometres, i.e. the mesoscale range, and at resolutions of a few kilometres, i.e. the convective scale [21]. For solar power applications, horizontal resolutions between 1 km and 15 km are most often used. Weather models that operate on global scales, such as GFS (Section 3.4.1), simulate at a coarser resolution, typically at a horizontal resolution around 30 kilometres.


2.3.2 The Weather Research and Forecasting model

The Weather Research and Forecasting model, in short WRF, is an open-source model developed and maintained by the NCAR, NOAA and NCEP and is used both in production and for research. In this work, we use version 4.2 of the WRF model, which includes physical schemes specifically designed for solar energy applications. The physical schemes upgrade the standard WRF model for solar power applications by including irradiance parameters and by improving the representations of feedbacks between solar irradiance, clouds and aerosols. In this section, we elucidate the improvements and discuss the latest literature on nowcasting with the WRF model.

The improvements were introduced by Jimenez et al. [6], who developed an augmentation of the WRF model called WRF-Solar and presented upgrades to the WRF model that contribute to making the model appropriate for solar power forecasting. Recently, the complete WRF-Solar suite has been added to the standard WRF model [22]. One of the upgrades is that the Solar augmentation outputs multiple solar irradiance components relevant to the solar industry. These parameters are the Global Horizontal Irradiance (GHI), Direct Normal Irradiance (DNI) and Diffuse (DIF) irradiance components. Additionally, the components are provided at every model time step, whereas in the standard WRF model, radiation components are computed only once every 10 model steps.

The standard WRF model ignores the effect of atmospheric aerosols on radiation. The lack of attention in earlier research to aerosols stems from the fact that the radiative effect of aerosols is relatively small. The WRF-Solar model incorporates aerosol-radiation interactions by updating two particular shortwave parameterizations. The updates account for the aerosol direct effect and better represent the direct and diffuse radiation components. For the assessment of the updated parameterizations, the performance is measured with an emphasis on clear-sky conditions because the aerosol direct effect is the largest source of uncertainty under these conditions. More accurate modelling of irradiance under clear-sky conditions is the first step to more accurate modelling under all-sky conditions.

Lastly, WRF-Solar improves the representation of cloud-aerosol feedbacks and cloud-radiation feedbacks. The improvements regarding the cloud-aerosol feedbacks, in other words the aerosol indirect effect, also stem from an updated parameterization. Combining this parameterization with the one that incorporates the aerosol-radiation interactions ensures that the first and second aerosol indirect effects are accounted for. An update of the shallow cumulus parameterization upgrades the feedback between shortwave radiation and clouds on the sub-grid scale and improves the representation of cloud-aerosol feedbacks. In Section 3.1.2, we discuss the parameterizations, and how certain combinations of parameterizations work together, in more detail.

One of the studies that implemented the WRF model to nowcast solar irradiance is a case study that combined several nowcasting methods [19]. The research aims to create a nowcasting system combining the strengths of the various methods. For the case study, the researchers compared four specific configurations of the WRF model and found that a horizontal resolution of 9 km in combination with a shallow cumulus parameterization results in the most accurate GHI forecasts. The results show that the physical schemes for aerosol-radiation interaction, updated by the WRF-Solar augmentation, reduce the errors in clear-sky conditions. Also, the updated shallow cumulus scheme reduces GHI errors significantly in cloudy conditions when compared to WRF without the shallow cumulus parameterization.

A more recent study compares three WRF model configurations to the forecasts of the Global Forecasting System (GFS) and finds that all three configurations have lower forecasting errors than GFS [23]. The study's output also shows that the WRF models' performance correlates with that of GFS, which provides the initial and boundary conditions to the WRF model, and that all four models largely overestimate irradiance.

When comparing the individual WRF configurations, the study finds that the performance of a configuration depends on the sky conditions. As stated by [6], the aerosol direct effect is most pronounced under clear-sky conditions and thus modelled best by the WRF-Solar model, which accounts for the aerosol direct effect via the shortwave parameterizations. The standard WRF model with the Dudhia shortwave radiation scheme performs slightly better in cloudy and fully overcast conditions, and the WRF model configured with the RRTMG scheme performed best overall.

2.3.3 Optical Flow

Currently, the most widespread nowcasting technique for solar power applications is optical flow [24, 25], as it combines accurate short-term forecasts with low computational requirements, especially when compared to NWP. For solar nowcasting, optical flow is a technique used to predict the cloud's trajectory using imagery from satellites and radars. Using consecutive images, optical flow algorithms infer motion patterns and compute displacement vectors, more specifically Cloud Motion Vectors (CMV). The cloud's future trajectory is estimated based on these CMVs.
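As a rough illustration of how such cloud motion vectors can be derived and used, the sketch below applies OpenCV's Farneback dense optical flow to two consecutive frames and advects the latest frame one step ahead. OpenCV and the synthetic frames are illustrative assumptions, not the operational implementation used in this study.

```python
# A hedged sketch of a cloud-motion-vector (CMV) nowcast with OpenCV's Farneback dense
# optical flow: per-pixel displacement vectors are estimated from two consecutive frames
# and used to extrapolate the most recent frame one time step ahead.
import cv2
import numpy as np

def advect_one_step(frame_prev: np.ndarray, frame_curr: np.ndarray) -> np.ndarray:
    """Extrapolate frame_curr one time step ahead using dense optical flow."""
    prev8 = cv2.normalize(frame_prev, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    curr8 = cv2.normalize(frame_curr, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Cloud motion vectors: displacement in x and y for every pixel.
    flow = cv2.calcOpticalFlowFarneback(prev8, curr8, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = frame_curr.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Backward warping: sample each forecast pixel from where the cloud came from.
    map_x = (grid_x - flow[..., 0]).astype(np.float32)
    map_y = (grid_y - flow[..., 1]).astype(np.float32)
    return cv2.remap(frame_curr.astype(np.float32), map_x, map_y, cv2.INTER_LINEAR)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((108, 108)).astype(np.float32)
    shifted = np.roll(base, shift=2, axis=1)   # the "cloud field" moved two pixels east
    nowcast = advect_one_step(base, shifted)
    print(nowcast.shape)                       # (108, 108)
```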

A significant limitation of optical flow is the assumption that pixel values in consecutive images are constant, meaning that the tracked object does not change shape. Consequently, physical processes fundamental to the evolution of clouds, such as orographic lifting and convergence, are not captured by optical flow. Also, optical flow assumes that motions are small and that clouds propagate at similar speeds as in the past.

Similar to [9], in this study optical flow is used as a baseline model to set a benchmark. The study compares various algorithms from RainyMotion [26] and finds that Fast Optical Flow using Dense Inverse Search [27] performs better than classic algorithms such as the Lucas-Kanade algorithm [28]. Section 3.2 provides a technical description of the baseline model used in this study.


2.3.4 Deep Neural Networks

One of the most recent advances in the space of nowcasting is the use of Deep Neural Networks (DNN). Contrary to optical flow and NWP, deep neural networks can exploit the large amount of data collected continuously from ground-based cameras, radars, weather stations and satellites. Also, improvements in dedicated hardware, such as GPUs, and the development of more efficient model architectures accelerate the use of machine learning techniques.

Most progress is made in nowcasting precipitation. One of the early studies tackling the nowcasting problem from a deep learning perspective is [7]. In this study, researchers formulate the nowcasting problem as a spatiotemporal sequence forecasting problem and introduce an end-to-end trainable model, where both input and output are spatiotemporal sequences. The proposed model architecture is a Convolutional Long Short-Term Memory (ConvLSTM), which captures temporal dependencies using the LSTM cells and spatial dependencies with the convolutional layers. The model is trained using 812 days of radar data and outperformed a state-of-the-art operational optical flow algorithm called ROVER.

The follow-up study [8] led to the emergence of two more accurate model architectures: the Convolutional Gated Recurrent Unit (ConvGRU) and the Trajectory Gated Recurrent Unit (TrajGRU). The first updates the ConvLSTM by replacing the LSTM cells with Gated Recurrent Unit (GRU) cells. The latter differs from ConvLSTM and ConvGRU. In the TrajGRU model, convolutions are used to generate flow fields. This enables the model to learn location-variant structures and capture spatiotemporal correlations more efficiently.

Figure 4: Example of an encoder-decoder, sequence-to-sequence architecture with RNN cells [20]. The circles represent the RNN cells, and the dotted arrows are the recurrent operations. The solid arrows represent the input and output of the cells. In the example, three input frames are used to predict two future frames.


More recently, Google’s researchers introduced MetNet and show that the model is capable of outperforming the current operational NWP by the National Oceanic and Atmospheric Administration (NOAA), High-Resolution Rapid Refresh (HRRR), at predictions up to 8 hours. MetNet is a neural network model trained on both radar and satellite data. The data are first processed by a convolutional LSTM, second by axial attention layers. The attention mechanisms allow the model to ignore some parts of the data and focus on others and enable the model to learn long-term dependencies. The output of the architecture is a probability distribution, giving a probability a specific rate of precipitation in each grid cell.

To our knowledge, solar nowcasting based on satellite data and deep learning techniques has not been covered in the literature yet. The published literature on nowcasting irradiance with DNNs is based on sky images retrieved from ground-based cameras [29, 30]. This approach limits the lead time of the forecast because the ground-based cameras cover only a small geographic area. Consequently, models trained on such data can only consider the possible motion of the clouds over a very short period. Using satellite data, we can model larger geographical areas and generate forecasts further ahead in time.


3 Methodology and Technical Description

We next describe the technical details of the methods used to nowcast solar irradiance. We discuss each method individually. After that, we provide a discussion of the data we use and describe how we aim to assess the forecast accuracy.

3.1 WRF Model Description

For this research, we use version 4.2 of the Advanced Research WRF (ARW). This is one of two dynamical cores the WRF system provides and contains a large variety of physics schemes and dynamics options. In this section, both a technical and practical description of the model is provided. First, we give an overview of the model and discuss initialisation and lateral boundary conditions. Subsequently, we discuss the essential physics schemes and implementation of the WRF model.

Figure 5: Flowchart showing the data flows between the individual components of the preprocessing system and the WRF model.

The ARW framework consists of two main bodies: the preprocessing system and the WRF model itself. We describe the individual programs of ARW following the model overview presented in Figure 5. The first part of the flow chart is the WRF Preprocessing System (WPS) and consists of three programs, each performing a single stage of data preparation. By preparing the input data, the WPS provides the model domain, the initial state of the atmosphere and the lateral boundary conditions for the WRF model.

The first modelling step is to specify the computational domain. Based on terrestrial data provided by the NCAR [31], the geogrid program generates the geographical grid with the horizontal resolution and projection type specified by the user. Within the model domain, the program horizontally interpolates static geographical variables, such as elevation and monthly vegetation, onto the grid. The grid can be specified with one of four available map projections: the Lambert conformal, polar stereographic, Mercator, and latitude-longitude projections. The Lambert conformal projection is best suited for mid-latitudes [21] and is used in this research. Which resolution is best depends on the use case, can vary widely from one application to another, and partly determines how the WRF model is best set up. Parameterizations exemplify this: at coarse resolutions, most physical processes are not explicitly resolved, making parameterizations essential. When considering clouds, a general rule of thumb is to use cumulus parameterizations at resolutions around 10 kilometres and coarser. Within the 1 to 10 kilometres range, there is no clear consensus on whether cumulus convection is resolved or better represented by parameterizations. For resolutions finer than 4 kilometres, however, it is recommended not to use cumulus parameterization.
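The small helper below merely restates this rule of thumb in code; the thresholds mirror the text and the function is purely illustrative, not part of any WRF configuration logic.

```python
# Rule of thumb for cumulus parameterization as a function of grid spacing: enable it at
# roughly 10 km and coarser, disable it below roughly 4 km, and treat the range in between
# as a grey zone. Thresholds follow the text above; the function itself is illustrative.
def cumulus_recommendation(grid_spacing_km: float) -> str:
    if grid_spacing_km >= 10.0:
        return "enable cumulus parameterization"
    if grid_spacing_km <= 4.0:
        return "disable cumulus parameterization (convection assumed resolved)"
    return "grey zone: no clear consensus, evaluate both options"

for dx in (27.0, 9.0, 3.0):
    print(f"{dx} km -> {cumulus_recommendation(dx)}")
```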

Figure 6: The two nest configurations the WRF model allows [21]. The nests in the left image are telescopic nests, where grid 1 is parent to grid 2, grid 2 is parent to grid 3, and so on. The configuration on the right is an example of two nests on the same level, meaning that they have the same parent grid.

The model also supports horizontal nesting. This allows the user to introduce additional grids into the simulation with finer resolutions. It is recommended to increase the model resolution from a parent domain to the nested domain by a ratio of 3. For example, a 27 kilometres resolution for the parent domain and a 9 kilometres resolution for the nested domain. Important to note is that the outer cells of the computational grid are inaccurate as the model has to adapt to the new resolution. Lastly, the model provides a choice between one-way or two-way nesting. We implement two-way nesting, meaning there is feedback between domains. This is not the case for one-way nesting.

The second program, ungrib, gives the model a starting point from which to run a forecast. It provides the initial state of the atmosphere by converting the large-scale meteorological input data, which is GRIB-formatted, into an intermediate format. Subsequently, metgrid interpolates the meteorological fields from ungrib's intermediate files (with a coarser resolution) to the finer resolution of the regional WRF model and matches the meteorological fields to the model grids defined by geogrid.

Once all the WPS steps are performed, the prepared data is inputted into the real program of the ARW system. The data now consists of both 2- and 3-dimensional fields, such as topography height and relative humidity, respectively. The real program defines the vertical layers by vertically interpolating the data onto the model coordinates. It is recommended to use at least 30 vertical layers with a model top at 50 hPa, which is approximately 20 km. Contrary to the horizontal resolution, the vertical resolution is not constant over the whole domain. The resolution is finer at the bottom and gradually increases towards the top of the domain.

By defining the vertical grid, the real program builds the initial conditions and the lateral boundary conditions for the wrf program. For the parent domain, the lateral boundary conditions are derived from the external files, i.e. the inputted meteorological data. The parent domain is then used to derive the lateral boundary conditions for the nested domain. The last step is to run the wrf program, which solves the dynamic and thermodynamic equations of the atmosphere using a dynamical solver and generates a weather forecast.

3.1.1 Governing Equations

The ARW numerically integrates the moist Euler equations, which govern the flow and represent the conservation of mass, momentum and energy. The equations have the flux form, are based on the variables defined below and are presented in (4)-(6). We use the following prognostic variables

\[
\mathbf{V} = \mu_d \mathbf{v} = (U, V, W), \qquad \Omega = \mu_d \omega, \qquad \Theta_m = \mu_d \theta_m, \qquad Q_m = \mu_d q_m \tag{3}
\]

where $\mathbf{v} = (u, v, w)$ are the horizontal and vertical velocities, $\omega$ the vertical velocity in the model's terrain-following coordinate and $\theta_m$ the potential temperature. The parameters $q_m = q_v, q_c, q_r, \ldots$ represent the mixing ratios of water vapour, cloud water, rain water and other moisture variables. In the following, we define the flux-form Euler equations [21].

Equations of motion:

\[
\partial_t U + (\nabla \cdot \mathbf{V} u) + \mu_d \alpha\, \partial_x p + (\alpha/\alpha_d)\, \partial_\eta p\, \partial_x \phi = F_U \tag{4a}
\]
\[
\partial_t V + (\nabla \cdot \mathbf{V} v) + \mu_d \alpha\, \partial_y p + (\alpha/\alpha_d)\, \partial_\eta p\, \partial_y \phi = F_V \tag{4b}
\]
\[
\partial_t W + (\nabla \cdot \mathbf{V} w) - g \left[ (\alpha/\alpha_d)\, \partial_\eta p - \mu_d \right] = F_W \tag{4c}
\]

First law of thermodynamics (conservation of energy):

\[
\partial_t \Theta_m + (\nabla \cdot \mathbf{V} \theta_m) = F_{\Theta_m} \tag{5}
\]

Conservation of mass (continuity):

\[
\partial_t \mu_d + (\nabla \cdot \mathbf{V}) = 0 \tag{6a}
\]
\[
\partial_t Q_m + (\nabla \cdot \mathbf{V} q_m) = F_{Q_m} \tag{6b}
\]

In the above equations, $\alpha$ is the inverse of the air density, $\alpha = \rho^{-1}$, $\alpha_d$ the inverse density of dry air, and the relation between both is $\alpha = \alpha_d (1 + q_v + q_c + q_r + q_i + \ldots)^{-1}$. The dry hydrostatic pressure equation is $\partial_\eta \phi = -\alpha_d \mu_d$ and the equation of state is defined as

\[
p = p_0 \left( \frac{R_d \theta_m}{p_0 \alpha_d} \right)^{\gamma} \tag{7}
\]

where $R_d$ is the gas constant for dry air and $\gamma = c_p/c_v = 1.4$ the heat capacity ratio.


3.1.2 Model Physics

The WRF model can be configured using a wide variety of physics options that interact with the model’s dynamics and thermodynamics core. These physics options, i.e. parameterizations, account for sub-grid processes such as microphysics, cumulus convection and atmospheric radiation. In this section, we describe the main categories and the options used in the configurations evaluated for this research. Broadly, there are five categories: microphysics, cumulus parameterization, planetary boundary layer (PBL), land-surface model, and radiation [21, 32]. Figure 7 presents these categories and illustrates the interactions between parameterizations with one-directional arrows.

Figure 7: A schematic overview of the direct interactions of parameterizations in the WRF model [32].

Microphysics. This scheme describes unresolved cloud processes. The parameterizations in this category handle cloud processes on scales much smaller than the model resolution, such as droplet formation, and account for phase changes of moisture. These schemes also include the transport of latent heat that is released due to phase changes, the evolution and interaction of water and ice particles, and the fall-out of precipitating particles. Parameterization of these microphysical processes is a major challenge, as they are complex and poorly understood [16]. In total, the WRF model provides 19 different microphysics schemes; the scheme implemented for this study is the aerosol-aware Thompson microphysics scheme [33]. The scheme is developed for mid-latitude convective, orographic and snowfall conditions and for simulations at convection-permitting grid scales.


Cumulus Parameterization. These schemes account for unresolved convection and represent the sub-grid effects of convective and shallow clouds, which is essential at coarser model resolutions. Typically, cumulus parameterizations are used at resolutions around 10 km and coarser. The parameterizations control unresolved updrafts and downdrafts, i.e. vertical transport driven by latent heat within convective clouds. Shallow cumulus parameterizations account for the transport in shallow clouds. There are ten cumulus parameterizations available, and the scheme can be disabled for simulations at fine resolutions, e.g. on a ≤ 4 km grid.

For this research, we implement the updated Kain-Fritsch scheme [34]. The scheme is based on a simple cloud model with moist updrafts and downdrafts and represents the convective clouds that are not resolved at the used model resolution. Additionally, we implement the Deng shallow cumulus scheme [35]. This scheme considers shallow clouds with nearly neutral buoyancy and with a buoyant updraft, and transitions to the Kain-Fritsch scheme in unstable conditions.

Land-surface Model. This scheme takes information from all the other parameterizations and derives moisture and heat fluxes at the surface. The schemes provide fluxes over both land points and sea-ice points and update land-surface variables, such as soil temperature and snow cover. There are four available schemes, and the one used for this study is the Noah Land Surface Model [36]. This scheme covers the top two meters of the land surface in four individual layers and models soil temperature, soil moisture, evapotranspiration, soil drainage, and runoff. Also, it considers seasonality in all vegetation categories.

Planetary Boundary Layer. These parameterizations handle vertical sub-grid surface heat and moisture fluxes due to transport by eddies. Eddies are circulating flows in a different direction than the general flow and are caused by obstructions. The planetary boundary layer (PBL) schemes determine the heat and moisture fluxes in the whole atmosphere, not just the planetary boundary layer. The WRF model provides 12 schemes, and alternatively, the scheme can be deactivated. The Mellor-Yamada-Janjic (MYJ) PBL scheme is used in this research.

Radiation. These schemes are responsible for the parameterization of processes heating the atmosphere and the soil. Both shortwave and longwave radiation schemes are provided. The shortwave schemes provide downward shortwave radiation from the Sun as the only source, and the longwave schemes account for the longwave radiation emitted by gases and surfaces. The schemes used in this study are the schemes updated for the WRF-Solar suite [6]. The updated schemes compute and output the diffuse and direct components of solar radiation at every time step and account for feedback between radiation, clouds and aerosols. For this study, we implement the Rapid Radiative Transfer Model (RRTMG) [37] longwave and shortwave schemes.


3.1.3 Variational Data Assimilation

In atmospheric research, Data Assimilation (DA) is a statistical method used to combine data from an NWP model forecast with observational data, for example a forecast from the GFS model combined with observational irradiance data from a geostationary satellite. The aim of combining both is to provide a better estimate of the initial state of the atmosphere through iterative minimisation of a cost function [21]:

\[
J(\mathbf{x}) = J_b(\mathbf{x}) + J_o(\mathbf{x}) = \frac{1}{2}\left(\mathbf{x}-\mathbf{x}^b\right)^{T}\mathbf{B}^{-1}\left(\mathbf{x}-\mathbf{x}^b\right) + \frac{1}{2}\left(\mathbf{y}-H(\mathbf{x})\right)^{T}\mathbf{R}^{-1}\left(\mathbf{y}-H(\mathbf{x})\right) \tag{8}
\]

The state $\mathbf{x}$ that optimises the cost function is found through iterative minimisation. The state $\mathbf{x}$ is the vector containing grid values describing the model state, $\mathbf{x}^b$ is the background state (or first guess), and the observations are denoted as $\mathbf{y}$. The matrices $\mathbf{B}$ and $\mathbf{R}$ are the covariance matrices of the background error and the observational error respectively, and $H$ converts the state vector to the observational space.
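A minimal numpy sketch of evaluating the cost function (8) for a toy state is given below; the observation operator and the covariance matrices are illustrative assumptions, not values used in any real assimilation system.

```python
# A minimal numpy sketch of the variational cost function (8): the background term J_b
# plus the observation term J_o, evaluated for a toy two-variable state. H, B and R are
# illustrative assumptions.
import numpy as np

def cost(x, x_b, y, H, B, R):
    """3D-Var style cost J(x) = J_b(x) + J_o(x) for a linear observation operator H."""
    dx = x - x_b
    dy = y - H @ x
    j_b = 0.5 * dx @ np.linalg.solve(B, dx)
    j_o = 0.5 * dy @ np.linalg.solve(R, dy)
    return j_b + j_o

x_b = np.array([290.0, 0.6])       # background state (e.g. temperature, cloud fraction)
y = np.array([0.55])               # a single (satellite-derived) observation
H = np.array([[0.0, 1.0]])         # observation operator: observe the second state variable
B = np.diag([4.0, 0.04])           # background error covariance
R = np.array([[0.01]])             # observation error covariance
print(cost(x_b, x_b, y, H, B, R))  # at the first guess only the observation term remains
```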

For solar power applications, data assimilation is a very complex process because clouds are the primary driver determining the solar radiation reaching the surface. Generally, clouds are derived from satellites or radars using an optical depth threshold. For satellite and ground-based detection, it is difficult to detect individual cloud layers and hard to derive layers overlapping at different levels. Also, the exact location of a cloud in a 3D space is unknown as real-time cloud data provides only 2D information on current clouds. Consequently, the location in a 3D space is estimated, introducing an additional source of uncertainty to the model. Due to the complexity of the technique, the ambition to study the use of deep learning techniques and time limitations of a thesis project, data assimilation is not within the scope of the project.

3.1.4 Implementation

A significant amount of time has gone into understanding the WRF model and developing a robust implementation allowing for automated runs. In this section, we describe the WRF model implementation in a Docker container, the Python skin developed for automation and the hardware used for the experiments. Lastly, we describe some issues we encountered and how to solve these.

Docker To help with the installation of the WRF model, we create a Docker container. For the corresponding Docker image, we use the CentOS base image and install all WRF model dependencies following an online tutorial [38] and Dockerfiles published by the NCAR [39]. The use of CentOS 7 is recommended because it allows for easy installation of all WRF model dependencies. After installing the dependencies, we perform various tests [38] and, when all tests pass, we compile the WRF model.


Python Running the WRF model is a tedious process and requires many individual steps. Consequently, setting up a model run is time-consuming and prone to human error. To make it easier and more efficient to run the WRF model, we developed a Python skin to automate the process of running the model. The code takes care of all steps otherwise done manually and uses templates and pre-sets to configure the model. It also automatically downloads the data needed to initialise the model. Another benefit is that the Python skin makes it much easier for new users to run the model, as the model has a steep learning curve.
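A simplified sketch of what such a Python skin can look like is shown below: each WPS/WRF program is run in order with a templated namelist. The paths, the template mechanism and the directory layout are hypothetical, and this is not the implementation developed for Dexter Energy Services.

```python
# A hypothetical sketch of automating the WRF workflow from Python: fill namelist templates
# with run settings, then run the WPS and WRF programs in order. Paths and template names
# are assumptions for illustration only.
import subprocess
from pathlib import Path
from string import Template

WPS_STEPS = ["geogrid.exe", "ungrib.exe", "metgrid.exe"]
WRF_STEPS = ["real.exe", "wrf.exe"]

def render_namelist(template_path: Path, out_path: Path, settings: dict) -> None:
    """Fill a namelist template (e.g. namelist.wps or namelist.input) with run settings."""
    out_path.write_text(Template(template_path.read_text()).substitute(settings))

def run_step(run_dir: Path, executable: str) -> None:
    """Run one WPS/WRF program and fail loudly on a non-zero exit code."""
    subprocess.run([f"./{executable}"], cwd=run_dir, check=True)

def run_forecast(run_dir: Path, settings: dict) -> None:
    render_namelist(run_dir / "namelist.wps.tpl", run_dir / "namelist.wps", settings)
    render_namelist(run_dir / "namelist.input.tpl", run_dir / "namelist.input", settings)
    for step in WPS_STEPS + WRF_STEPS:
        run_step(run_dir, step)
```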

Google Cloud Using the container containing the model and all the data, we run the WRF model in a Virtual Machine (VM). The model runs are performed on a VM hosted by the Google Cloud Platform. The used compute engine is the n1-standard-4 type with 4 Intel Xeon CPUs, 15GB RAM and two mounted solid-state drives of 375GB each [40].

Implementation Issues and Solutions One of the model dependencies, OpenMPI, was updated mid-June 2020 and broke the WRF implementation. In an attempt to solve the problem, we both updated and re-installed the individual dependencies within the running container and checked whether using the latest versions of all dependencies would fix the issue. Both attempts failed to solve the problem. To solve it, we rebuilt the Docker image from scratch.

Later, we encountered another issue that blocked certain WRF runs. Some runs, seemingly random, got stuck and resulted in a segmentation error. It is unclear what was causing the model crash. To solve it, we deleted and rebuilt the Docker image again. After this rebuild, we were able to run all configurations that got stuck before without a problem.

3.2 Optical Flow Baseline

In this study, we set the benchmark with an optical flow model implemented by a previous research intern at Dexter Energy Services [41]. The thesis compares Eulerian and Lagrangian persistence with various optical flow algorithms available in the RainyMotion library [26]. Additionally, the thesis introduces a new model. The model is an ensemble of multiple optical flow algorithms and is computed by taking the mean of the prediction of N optical flow algorithms. The model is defined as follows:

\[
\bar{e}_t(x, y) = \frac{1}{N} \sum_{i=1}^{N} p_t^{\,i}(x, y) \tag{9}
\]

where $\bar{e}_t(x, y)$ is the forecast of the ensemble model at lead time $t$ and at coordinates $x, y$, and $p_t^{\,i}$ is the prediction of the $i$-th optical flow algorithm. The ensemble model in [41] combines the predictions of the Farneback, DeepFlow and Dense Inverse Search algorithms. This model was the best performing optical flow based model and is used in this study to set the baseline.
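A minimal sketch of the ensemble step in (9) is given below; the member arrays are placeholders standing in for the outputs of the individual algorithms named above.

```python
# A minimal sketch of the ensemble baseline in (9): the forecast at lead time t is the
# per-pixel mean over the member predictions. The arrays below are placeholders for the
# outputs of the individual optical flow algorithms.
import numpy as np

def ensemble_nowcast(member_predictions: list[np.ndarray]) -> np.ndarray:
    """Average N optical flow predictions for one lead time into a single forecast."""
    return np.mean(np.stack(member_predictions, axis=0), axis=0)

# e.g. predictions from Farneback, DeepFlow and Dense Inverse Search for lead time t:
farneback = np.random.rand(108, 108)
deepflow = np.random.rand(108, 108)
dis = np.random.rand(108, 108)
forecast_t = ensemble_nowcast([farneback, deepflow, dis])
```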


3.3 Neural Weather Models

In this section, we provide a technical description of the DNN for solar nowcasting [8]. First, we provide a general introduction to the encoder-decoder structure and second, we present the ConvGRU model used to generate predictions. We follow the formulation introduced by [7] and consider irradiance nowcasting a spatiotemporal sequence forecasting problem. A more formal definition is presented in (10), where the input is a sequence of length J containing previous observations and the forecast a sequence of K frames ahead:

\[
\tilde{I}_{t+1}, \ldots, \tilde{I}_{t+K} = \underset{I_{t+1}, \ldots, I_{t+K}}{\arg\max}\; p\left(I_{t+1}, \ldots, I_{t+K} \mid \hat{I}_{t-J+1}, \hat{I}_{t-J+2}, \ldots, \hat{I}_{t}\right) \tag{10}
\]

Here, the input and predictions are sequences containing 2D maps, more formally $I \in \mathbb{R}^{C \times H \times W}$.

The spatial dimensions of the 2D grid are denoted by H (number of rows) and W (number of columns). In each pixel, there are C time-varying measurements. The model learns by minimising the forecast error through back-propagation and learns spatiotemporal relations without explicit assumptions about atmospheric physics.

In the following, we describe the defining features of the DNN and explain how these allow the model to capture both spatial and temporal correlations from dynamic input data. Lastly, we introduce a new model that allows for additional input channels of static data and describe how it combines both static and dynamic data to predict future weather.

Figure 8: Overview of the encoder-decoder structure, where the input consists of two images, $I_1$ and $I_2$, and the output of two predicted images, $\hat{I}_3$ and $\hat{I}_4$. $G$ represents static data concatenated to the hidden state.


3.3.1 Encoder-Decoder Structure

For the sequence forecasting problem, we adopt an encoder-decoder sequence-to-sequence structure [42]. This structure maps an input sequence of fixed length to a fixed-length output sequence and allows the input and output sequences to have different lengths. An example is visualised in Figure 8. First, the encoder processes the elements in the spatiotemporal input sequence and creates a smaller, higher-dimensional representation. The dimensionality of this representation is fixed and is called the hidden state. Subsequently, another set of RNNs, called the decoder, learns to generate the predictions from the hidden state through multiple layers.

The model for this research contains downsampling layers between the RNN cells in the encoder and upsampling layers between the cells in the decoder. Both are implemented by strided convolutions and deconvolutions, respectively. For the RNN cells, we can implement any RNN. In this research, we use the Convolutional Gated Recurrent Unit (ConvGRU), which we describe in the next section.

3.3.2 Convolutional Gated Recurrent Unit

The Convolutional Gated Recurrent Unit (ConvGRU) model used in this research is introduced by [8], and Dexter Energy Services provides an implementation. In this section, we provide a technical description of the model and describe how the model exploits both spatial and temporal features driving the weather.

Convolutions capture the spatial context of the input frame. Convolutional layers assign importance to certain objects in the image and learn what parts of an input frame are important. Also, these layers can differentiate between different objects in the image.

The Recurrent Neural Network (RNN) units capture the temporal dependencies by learning what previous information in a sequence is important. A standard RNN is best suited for learning short-term dependencies; when learning long-term dependencies, problems can occur. The gap between relevant information and the point where it is needed may become too large, resulting in the model's inability to connect the pieces of information. This is called the vanishing gradients problem. To solve this problem, [44] introduced a more sophisticated version of the RNN: the Long Short Term Memory network (LSTM). This special RNN structure is capable of modelling long-term dependencies as it learns what past information is important and forgets irrelevant information. More recently, [45] introduced a simplified variant of the LSTM called the Gated Recurrent Unit (GRU). A GRU cell combines the forget and input gates of the LSTM into a single update gate and merges the cell state and hidden state. Consequently, the GRU model is simpler than the LSTM and more efficient to train. We use Figure 9 to explain the formulas in (11).

The GRU takes two sources of information: the memory state holding information from the previous units, denoted $H_{t-1}$, and the information new to the model. The latter is denoted as $X_t$ and is a single element of an inputted sequence. The first step in a GRU is to determine what information to pass on into the future using the update gate $Z_t$. Subsequently, the reset gate $R_t$ determines what information to forget. With the output of the reset gate, the unit computes the new memory content $H'_t$ through the activation function $f$. As the last step, the unit computes the memory state $H_t$ passed on to the next unit in the network. The corresponding formulas are as follows:

\[
\begin{aligned}
Z_t &= \sigma\left(W_{xz} * X_t + W_{hz} * H_{t-1}\right) \\
R_t &= \sigma\left(W_{xr} * X_t + W_{hr} * H_{t-1}\right) \\
H'_t &= f\left(W_{xh} * X_t + R_t \circ \left(W_{hh} * H_{t-1}\right)\right) \\
H_t &= (1 - Z_t) \circ H'_t + Z_t \circ H_{t-1}
\end{aligned} \tag{11}
\]

where $X_t \in \mathbb{R}^{C_i \times H \times W}$ and $H_t, R_t, Z_t, H'_t \in \mathbb{R}^{C_h \times H \times W}$. Here $C$ is the number of channels, $H$ the height of the input image and $W$ the width. In the formulas, $*$ is the convolution operation, $\circ$ the Hadamard product and $f$ the Leaky Rectified Linear Unit (ReLU) activation function with a negative slope of 0.2. The sigmoid activation function is applied to both the update and reset gates.
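The update rules in (11) translate almost directly into code. The sketch below is a minimal ConvGRU cell in PyTorch, assuming stacked convolutions and illustrative channel counts; it is not the exact implementation provided by Dexter Energy Services.

```python
# A minimal ConvGRU cell sketch in PyTorch, following the update rules in (11).
# Kernel size and channel counts are illustrative assumptions.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_channels: int, hidden_channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Convolutions acting on the new input X_t (W_xz, W_xr, W_xh stacked).
        self.conv_x = nn.Conv2d(in_channels, 3 * hidden_channels, kernel_size, padding=padding)
        # Convolutions acting on the previous state H_{t-1} (W_hz, W_hr, W_hh stacked).
        self.conv_h = nn.Conv2d(hidden_channels, 3 * hidden_channels, kernel_size, padding=padding)
        self.activation = nn.LeakyReLU(negative_slope=0.2)

    def forward(self, x: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Split the stacked convolutions into update, reset and candidate parts.
        x_z, x_r, x_h = torch.chunk(self.conv_x(x), 3, dim=1)
        h_z, h_r, h_h = torch.chunk(self.conv_h(h_prev), 3, dim=1)

        z = torch.sigmoid(x_z + h_z)            # update gate Z_t
        r = torch.sigmoid(x_r + h_r)            # reset gate R_t
        h_new = self.activation(x_h + r * h_h)  # candidate state H'_t
        return (1.0 - z) * h_new + z * h_prev   # memory state H_t

# Example: one step on a batch of 108x108 irradiance frames with a single input channel.
cell = ConvGRUCell(in_channels=1, hidden_channels=16)
x = torch.randn(4, 1, 108, 108)
h = torch.zeros(4, 16, 108, 108)
h = cell(x, h)
print(h.shape)  # torch.Size([4, 16, 108, 108])
```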

3.3.3 Model Extension

Recent work in the medical field introduced model architectures that can exploit both static and dynamic features of medical data, such as gender and patient visits [46]. The proposed architectures combine RNNs to process the dynamic data with an independent feed-forward neural network processing the static data. The study finds that the new architectures improve the prediction score of clinical events, and they are the main inspiration for the model we introduce in the following.

Figure 9: A single Gated Recurrent Unit. The orange circles represent the Hadamard product, σ is the sigmoid activation function and f is a user-specified activation function. Figure from [43].


The weather is also driven by a combination of dynamic and static features. Rising terrain, for example, causes orographic lifting and stimulates the formation of clouds. The ConvGRU model discussed above is specifically designed to work with sequence data and cannot exploit such static data. This section introduces an extension to the model designed for static data input. The new model architecture, exploiting both dynamic and static information, is presented in Figure 10.

For the static input data, we develop an independent CNN with ReLU activation functions and batch normalisation, and we adapt the encoder-decoder structure. The encoder part of this network runs parallel to the encoder of the RNN and outputs a hidden state representation of the static information. The hidden states of both encoders are concatenated mid-way through the decoder, providing this information to the last layers, which output the irradiance prediction.

Figure 10: A simplified representation of the introduced architecture. The model consists of two independent networks, one for the static and one for dynamic data. Both data sources are encoded separately and concatenated mid-way through the decoder to generate the predictions.
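The sketch below illustrates the fusion step of Figure 10, assuming PyTorch. The static-encoder layout and channel sizes are illustrative assumptions, and the spatial dimensions of both encodings are assumed to match at the concatenation point.

```python
# A small CNN encodes the static fields (terrain height and land mask); its
# output is concatenated with the dynamic hidden state mid-way through the
# decoder, which then maps the fused representation to the prediction.
import torch
import torch.nn as nn


class StaticEncoder(nn.Module):
    def __init__(self, in_channels: int = 2, out_channels: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1),
            nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, out_channels, 3, stride=2, padding=1),
            nn.BatchNorm2d(out_channels), nn.ReLU(),
        )

    def forward(self, static_fields):       # [B, 2, H, W]
        return self.net(static_fields)      # [B, 32, H/4, W/4]


def fuse(dynamic_hidden, static_encoding):
    # Concatenate along the channel dimension; the remaining decoder layers
    # turn the fused representation into the irradiance prediction.
    return torch.cat([dynamic_hidden, static_encoding], dim=1)
```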

3.3.4 Input Patch

The ConvGRU model receives a four-dimensional tensor of size [t, c, h, w], with dimensions time, number of channels, height and width. The time dimension of the input patch is equal to 5, and one frame is provided every 15 minutes. The stand-alone ConvGRU model receives a single input channel, whereas the extended model receives three input channels: the irradiance sequence plus the static data. The static features we use in this work are terrain height and land mask.

The resolution of the irradiance data, discussed in Section 3.4.2, is 323 x 323 pixels. To reduce the computational time, we downsample the frames by a factor of three and round the result to an even number, as the dimensions need to be evenly divisible for the convolution operations. The resulting input patch has a resolution of 108 by 108 pixels.
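As an illustration, a minimal sketch of assembling one input patch is given below, assuming PyTorch and bilinear interpolation for the downsampling; the function and variable names are placeholders.

```python
# Stack five consecutive MSG-CPP frames and downsample from 323 x 323 to
# 108 x 108 pixels, yielding an input patch of shape [t, c, h, w].
import numpy as np
import torch
import torch.nn.functional as F


def make_input_patch(frames):                       # list of 5 arrays, each 323 x 323
    x = torch.from_numpy(np.stack(frames)).float()  # [t=5, 323, 323]
    x = x.unsqueeze(1)                              # [t=5, c=1, 323, 323]
    x = F.interpolate(x, size=(108, 108), mode="bilinear", align_corners=False)
    return x                                        # [t=5, c=1, h=108, w=108]
```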


3.3.5 Implementation

To speed up the training of the ConvGRU model, we containerise the model using Docker and train on a VM. For training and the nowcasting experiments, we use the n1-standard-8 type compute engine with 8 Intel Xeon CPU cores, 15GB RAM, a Tesla T4 GPU with 16GB RAM and two mounted solid-state drives of 375GB each [40].

Implementation Issue and Solution Working with the repository created for this study, we encountered one major issue, caused by the version of the Gdal python package, which provides the functions used to reproject the MSG-CPP data (Section 3.4.2). We found that the most recent versions of this library, versions 3.x.x, are not compatible with the existing python code in the repository. We recommend using version 2.4.2.

3.4 Data

We next describe the data used for solar nowcasting: first, the data from the global weather model used to initialise the WRF model, and second, the data used by the optical flow algorithm, the ConvGRU model and the forecast evaluation.

3.4.1 Global Forecasting System

The Global Forecast System (GFS) is a global weather model developed by the National Centre for Environmental Prediction (NCEP) and provides the initial and boundary conditions for the WRF model. The model consists of 32 vertical layers and has a horizontal resolution of 0.25°, which corresponds to approximately 28 kilometres. The GFS takes its initial conditions from the Global Data Assimilation System (GDAS) and is run four times a day, at 6-hour intervals; because a run takes close to 6 hours to complete, each forecast only becomes available about 6 hours after its initialisation time. Therefore, the WRF implementation in this study relies on the 00:00 UTC run of the GFS model, which is published just before 06:00 UTC. Lastly, the GFS model produces 1-hourly forecasts, which allows the WRF model to update the lateral boundary conditions once every hour throughout the forecast.

3.4.2 MSG-CPP

In this work, we use the Meteosat Second Generation Cloud Physical Properties dataset, in short MSG-CPP. The CPP algorithm is developed by the KNMI to derive cloud parameters and solar radiation at the surface from the SEVIRI instrument onboard the Meteosat Second Generation satellite [47]. The data is available at 15-minute intervals and only when the sun is up. The spatial resolution is 3 x 3 km, and the spatial domain spans the area between longitudes −50° and 50° and latitudes −80° and 80°.


The estimation of surface solar radiation is based on satellite observations. The CPP algorithm described in [48] considers the solar zenith angle (SZA) and computes the parameters describing the state of the atmosphere, such as cloud optical thickness and aerosol optical thickness. Based on the computed cloud properties, the algorithm derives the direct and diffuse irradiance components.

A limitation of using this particular dataset is that the daily range of the processed data is shorter than the actual period of sunlight. MSG-CPP first derives cloud properties and, based on these, estimates the incoming radiation at the Earth's surface. The algorithm can only derive the cloud properties when the solar zenith angle is not too large, as the estimations otherwise become very inaccurate. In the current version, the cloud properties are computed when the SZA is less than 78°, consequently missing a part of the day [48]. In a newer version, which is not yet operational, the algorithm can derive the cloud properties up to a larger SZA of about 84° (Jan Fokke Meirink, KNMI, personal communication, August 2020).

Preprocessing The data is retrieved from a geostationary satellite and spans an area much larger than the studied domain. In the preprocessing, we reproject the data to the map projection used by the WRF model, the Lambert Conformal Conic projection, and subsequently cut out the domain used for the WRF model simulation.
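A minimal sketch of this reprojection step with the GDAL Python bindings (version 2.4.2, see Section 3.3.5) could look as follows; the projection parameters, output resolution and file names are illustrative assumptions and should be matched to the actual WRF domain configuration.

```python
# Reproject a single MSG-CPP frame to a Lambert Conformal Conic grid with GDAL.
from osgeo import gdal

# Illustrative LCC definition centred on the study area (52.4N, 4.9E).
lcc = ("+proj=lcc +lat_1=52.4 +lat_2=52.4 +lat_0=52.4 +lon_0=4.9 "
       "+datum=WGS84 +units=m")

gdal.Warp(
    "msg_cpp_lcc.tif",      # output frame on the WRF grid (placeholder name)
    "msg_cpp_raw.tif",      # input frame in the geostationary projection (placeholder name)
    dstSRS=lcc,
    xRes=3000, yRes=3000,   # keep the native ~3 km resolution
    resampleAlg="bilinear",
)
```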

3.5 Forecast Evaluation

To study the forecast accuracy, we compare the forecasts to actual solar radiation at the surface. The actual irradiance is provided by the MSG-CPP dataset and used for the evaluation of all three methods. The metrics discussed in the following compare the pixel values. The pixel values represent the solar irradiance at the Earth’s surface in W/m2.

3.5.1 Metrics of Forecast Accuracy

The error metrics used to determine the forecast accuracy of the WRF model are the Mean Bias Error (MBE) and the Mean Absolute Error (MAE). These metrics are simple to compute, have a clear meaning and quantify the strength of the error signal over the whole image.

To study the forecast behaviour, we compute the MBE, which provides an insight into whether the model is over-estimating or under-estimating the solar irradiance. This metric is presented in (12), where $I_{t,\mathrm{observed}}$ is the observed irradiance and $\hat{I}_{t,\mathrm{forecast}}$ the predicted irradiance.

\[
\mathrm{MBE} = \frac{1}{n} \sum_{t=1}^{n} \left( \hat{I}_{t,\mathrm{forecast}} - I_{t,\mathrm{observed}} \right) \tag{12}
\]


Additionally, we compute the MAE. The metric is defined as follows:

\[
\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| \hat{I}_{t,\mathrm{forecast}} - I_{t,\mathrm{observed}} \right| \tag{13}
\]

We compute the metrics globally, i.e. over the whole model domain, and on normalised data to account for the irradiance strength varying throughout the day. We normalise the data by dividing by the clear-sky solar irradiance from the MSG-CPP dataset.
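As an illustration, the normalised metrics of (12) and (13) can be computed with a few lines of NumPy; the array names are placeholders, and each array is assumed to hold the pixel values of the whole model domain for a single time step.

```python
# Clear-sky-normalised MBE (12) and MAE (13) over one forecast image.
import numpy as np


def normalised_errors(forecast, observed, clear_sky):
    f = forecast / clear_sky          # normalise by the clear-sky irradiance
    o = observed / clear_sky
    mbe = np.mean(f - o)              # mean bias error, eq. (12)
    mae = np.mean(np.abs(f - o))      # mean absolute error, eq. (13)
    return mbe, mae
```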

3.5.2 Image Quality Assessment

Images are highly structured, meaning that the pixels strongly depend on each other. The error metrics discussed above quantify the strength of the errors but fail to account for patterns and textures in the images. For assessing the quality of the predicted images, we next describe the Structural Similarity Index (SSIM) [49]. The SSIM assesses the loss of structural information by comparing local patterns and is complementary to the approaches based on errors at the pixel level.

Structures in the images are independent of illumination and contrast. Consequently, SSIM separates the influence of local illumination to extract the structural information and uses three separate comparisons to measure the similarity between two images: structure, contrast and local luminance (14). All three are computed on local image patches using sliding windows. The function for luminance, l(x, y), measures the relative luminance change. The relative contrast is measured by the function c(x, y), and s(x, y) is the structure comparison function.

\[
l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \quad
c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \quad
s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{14}
\]

Here, the x and y inputs are images with non-negative pixel values, $\mu$ and $\sigma$ are the sample mean and standard deviation, and $\sigma_{xy}$ is the cross-correlation. The $C$ parameters are constants added to avoid numerical instability. All three components are relatively independent; for example, if the structures within an image change, this does not cause changes to the luminance and contrast in the image. Subsequently, all three comparison functions are combined into a single metric: the structural similarity index. This function is defined as follows:

\[
\mathrm{SSIM}(x, y) = f\big(l(x, y), c(x, y), s(x, y)\big) = \frac{\left(2\mu_x \mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}{\left(\mu_x^2 + \mu_y^2 + C_1\right)\left(\sigma_x^2 + \sigma_y^2 + C_2\right)} \tag{15}
\]

Figure 11: Image of clouds. In (a), two modifications are applied to the image: one adds noise and the other subtracts a constant. Both result in an identical MSE but in different SSIM values; the subtraction of a constant preserves the structures in the image more strongly than the addition of noise. In (b), we move the cloud left and right within the image and flip it to exemplify how these modifications affect the SSIM. The structures are identical but in a different location or orientation, which has a negative impact on the SSIM while keeping the MSE very small or equal to zero.

Finally, all local SSIM values are averaged, resulting in a single value for the image comparison. Figure 11 exemplifies how various modifications affect the MSE and the SSIM. Random noise and the subtraction of a constant result in the same mean squared error but have a different effect on the SSIM. The addition of random noise does not preserve the structure of the objects in an image and results in a very low SSIM, whereas the structure is mostly preserved when a constant is added to or subtracted from all pixel values, which is reflected in a much higher SSIM. This means that if a model predicts the shape and location of the clouds well but predicts stronger or weaker irradiance, the SSIM is still relatively high. In the bottom images, we see how a shift in location affects both metrics: the same objects in a different location or with a different orientation result in very little or no change in the MSE, while the SSIM does capture these modifications. In conclusion, the SSIM contains information on the structure and location of the predicted clouds and can therefore complement the global error metrics.
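In practice, the mean SSIM over an image pair can be computed with scikit-image, as sketched below; deriving the data_range from the observed image is an assumption of this sketch, and the function name is a placeholder.

```python
# Mean SSIM between a predicted and an observed irradiance image.
from skimage.metrics import structural_similarity


def mean_ssim(forecast, observed):
    data_range = float(observed.max() - observed.min())
    # structural_similarity slides a local window over both images and
    # averages the local SSIM values into a single score, as in (15).
    return structural_similarity(forecast, observed, data_range=data_range)
```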


4 Experimental Setup

We next describe the setup of the individual experiments and the methods used to compare the various models. For all experiments, we use the same forecasting window of 0 to 6 hours and output a forecast at 15-minute intervals. We also generate predictions over the same domain, and the period over which we generate forecasts runs from July 18 until October 31, 2019. The reason for this is that the GFS model received an update mid-2019 and since then provides hourly instead of three-hourly forecasts [50]; Dexter Energy Services saved the data from July 2018 onward. Also, we can only validate 6-hour forecasts until September 30, as the daily range of the MSG-CPP data is too short after that date, as discussed in Section 3.4.2.

The outline of this chapter is as follows. First, we describe the WRF model setup, which contains all practical information on how to run the model for solar nowcasting. After that, the optical flow and deep neural network experiments are described. Lastly, we explain how we measure the models’ performance and how we compare the output of all three methods.

4.1 WRF Model

For the setup of the WRF model, we rely on general recommendations, various tutorials, trial and error and literature on nowcasting irradiance with the model. The setup is done through two main initialisation files, called namelist files: one provides the configuration of the preprocessing system and the other the configuration of the WRF model itself.

Based on the configurations in the first namelist, the preprocessing system derives the initial conditions and the lateral boundary conditions from the GFS data discussed in Section 3.4.1. These lateral boundary conditions are updated every hour throughout the forecast, as GFS provides hourly forecasts. The geographical area and model domains are also configured within the first namelist file. The domain setup and the case study area used for this study are presented in Figure 12, and we provide the technical description in the next section. In Section 4.1.2, we discuss the configuration of the second namelist file, which contains the settings for the WRF model.

4.1.1 Case Study Area and Model Domain

The case study area for this research is a geographical area corresponding to the Netherlands and small parts of Germany and Belgium. The area is centred on Amsterdam at latitude 52.4° and longitude 4.9°. The country spans an area of close to 42,000 km2, of which 18% is water. The climate in the Netherlands is a temperate maritime climate, influenced by the North Sea and the Atlantic Ocean. Clouds appear almost every day, and the weather is quite unpredictable; the type of weather can vary a lot, even within a day.


For the WRF model experiments, we set up two model domains. The outer and coarser domain takes initial conditions and lateral boundary conditions from the GFS input data. The primary function of this parent domain is to bridge the gap between the coarser input data from a global weather model to the finer resolution of the domain of interest. The parent domain provides the nested domain with the lateral boundary conditions throughout the simulation. Important to note is that the nested domain has to adapt to the finer resolution. Consequently, the weather simulations in grid cells near the border of the nested domain can be inaccurate. To account for this behaviour, we ensure the nested domain covers an area slightly larger than the Netherlands. Lastly, we implement the two-way nest option allowing for feedback between the nests.

The domains have a horizontal spatial resolution of 27 and 9 kilometres respectively, following [6]. This is also consistent with the general advice to keep the ratio between the nested and parent domain equal to three [21]. The parent domain is composed of 60 × 60 grid cells, and the finer inner domain is composed of 70 × 70 grid cells. In both domains, the model is made up of 36 vertical layers and four soil layers. The domain size is limited to reduce the computational time. The model domain specifications are summarised in the Appendix. The Lambert conformal conic projection is used to define the two-dimensional grid, as it is best suited for mid-latitudes. This projection ensures that directions, angles and shapes are maintained, minimising distance distortions in the chosen domain.

In an NWP model, time is discretised and thus divided into steps. When specifying the time-step for the experiments, it is essential to consider the horizontal resolution. The time-step must comply with the resolution of the model domain, and a general rule of thumb is to use a time-step in seconds of 4 to 6 times the horizontal resolution in kilometres [21]. We settled on a time-step of 108 seconds (four times 27), as larger time-steps caused instability.

Figure 12: Domain decomposition of the WRF model. The outer domain has a horizontal resolution of 27 km and spans an area of 60 × 60 grid cells. The size of the inner domain is 70 × 70 grid cells and has a horizontal resolution of 9 km.
