Inflow forecasting using artificial neural networks for reservoir operation

In this study, multi-layer perceptron (MLP) artificial neural networks have been applied to forecast one-month-ahead inflow for the Ubonratana reservoir, Thailand. To assess how well the forecast inflows performed in the operation of the reservoir, simulations were carried out guided by the system's rule curves. As a basis of comparison, four inflow situations were considered: (1) inflow known and assumed to be the historic (Type A); (2) inflow known and assumed to be the forecast (Type F); (3) inflow known and assumed to be the historic mean for the month (Type M); and (4) inflow unknown, with the release decision conditioned only on the starting reservoir storage (Type N). Reservoir performance was summarised in terms of reliability, resilience, vulnerability and sustainability. It was found that the Type F inflow situation produced the best performance while Type N was the worst performing. This clearly demonstrates the importance of good inflow information for effective reservoir operation.


Introduction
The planning of reservoirs for various purposes including flood and drought control relies on the historic inflow data at the reservoir site. Due to natural variability and other factors (e.g. climate and land-use changes), however, the inflow situation when the reservoir is being operated will be different. It is therefore important that reservoirs are properly operated so that they continue to perform satisfactorily during changing hydro-climatology.
Reservoir operation concerns taking decisions on water release from a reservoir based on the amount of water available vis-à-vis the demand placed on the system. The available water is the sum of the starting period storage and the inflow expected during the period. Consequently, effective reservoir operation relies on a reliable forecast of the inflow into the reservoir. Traditional forecasting methods using hydrologic, hydraulic and time-series models require specification of the functional relationship of the model, which can be problematic (Zhang et al., 1998). For this reason, focus has recently shifted to data-driven techniques that do not require knowledge of this functional relationship. In particular, artificial neural networks (ANN) have been widely used to forecast reservoir inflows (see e.g. Edossa and Babel, 2012; Mohammadi et al., 2005) due to their effectiveness and flexibility, and have been shown to be superior to other approaches such as regression-based and time-series models.
The aim of this study is to apply multi-layer perceptron (MLP)-ANN for the one-month-ahead inflow forecasting for the Ubonratana reservoir, Thailand. To investigate the effect of the forecasts on reservoir operation performance, four situations were considered for the one-month-ahead inflow: (1) inflow is known and assumed to be the historic (Type A); (2) inflow is known and assumed to be the ANN forecast (Type F); (3) inflow is known and assumed to be the historic average for the given month (Type M); and (4) inflow is not known and the release decision is conditioned only on the starting reservoir storage (Type N). Simulations of the Ubonratana reservoir were then carried out with these alternative inflow scenarios and the resulting reservoir performance was summarised in terms of reliability, resilience, vulnerability and sustainability.
In the next section, further details about the methodology will be given. This is then followed by the presentation of the case study. Next the results are presented and discussed and finally, the main conclusions are given.
Published by Copernicus Publications on behalf of the International Association of Hydrological Sciences.

Artificial neural networks modelling
The theory and mathematical basis of ANN have been described excellently by Shamseldin (1997). Essentially, the structure of ANN comprises an input layer, an output layer and one or more hidden layers as illustrated in Fig. 1. The schematic in Fig. 1 has a single hidden layer which is generally sufficient to approximate any complex, non-linear function (Mulia et al., 2015). The layers contain nodes or neurons which are connected by weights. Determining optimal values for these weights and other parameters of the network is the purpose of the ANN training exercise.
For a given problem, the number of nodes in the output layer is fixed by the problem, e.g. in the current work, it is the 1-month ahead inflow forecast. The input nodes must be determined by the factors known to affect the output variable and this has been achieved through an examination of the cross-correlation matrix (see Adeloye and De Munari, 2006). The number of neurons in the hidden layer is much more difficult to arrive at and is normally determined as part of the training by trial and error as described by Adeloye and De Munari (2006).
Training is often improved through the use of the early-stop rule (ESR), which helps to avoid over-fitting. In ESR, the available data are divided into three parts: (i) a training set, used to determine the network weights and biases; (ii) a validation set, used to estimate the network performance and decide when the training should be stopped; and (iii) a test set, used to verify the effectiveness of the stopping criterion and to estimate the expected performance in the future.
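The three-way division and the stopping decision can be sketched in a few lines of Python. This is only an illustration: the function names, the chronological ordering of the split and the `patience` parameter are assumptions introduced here, not details given in the study; the default 90 : 5 : 5 fractions are those reported later for the 360-month record.

```python
def split_three_way(n_samples, fractions=(0.90, 0.05, 0.05)):
    """Return index ranges for the training, validation and test sets.

    Assumption: the split is chronological (earliest data used for training).
    """
    n_train = round(n_samples * fractions[0])
    n_val = round(n_samples * fractions[1])
    train = range(0, n_train)
    val = range(n_train, n_train + n_val)
    test = range(n_train + n_val, n_samples)  # whatever remains
    return train, val, test


def early_stop_epoch(val_errors, patience=3):
    """Return the epoch with the lowest validation error seen before the
    error fails to improve for `patience` consecutive epochs."""
    best_epoch, best_err, waited = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation error has stopped improving
    return best_epoch


train, val, test = split_three_way(360)
print(len(train), len(val), len(test))
```

The weights retained would be those from the returned epoch; the test set plays no part in choosing it.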
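The tested ANN architectures (in trying to arrive at the best value for the number of hidden neurons) were compared using the correlation coefficient (R) criterion, i.e.:

$$R = \frac{\sum_{i=1}^{N} (y_{obs,i} - \bar{y}_{obs})(y_{sim,i} - \bar{y}_{sim})}{\sqrt{\sum_{i=1}^{N} (y_{obs,i} - \bar{y}_{obs})^2 \, \sum_{i=1}^{N} (y_{sim,i} - \bar{y}_{sim})^2}} \qquad (1)$$

where $y_{sim}$ and $y_{obs}$ are respectively the simulated and observed values of the output variable, $\bar{y}_{sim}$ and $\bar{y}_{obs}$ are their means, and $N$ is the number of exemplars used.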
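As a minimal sketch, and assuming R is the usual Pearson product-moment correlation between observed and simulated values, the criterion could be computed as follows. The function name is illustrative only.

```python
import math


def correlation_coefficient(y_obs, y_sim):
    """Pearson correlation between observed and simulated series."""
    n = len(y_obs)
    if n != len(y_sim) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_obs = sum(y_obs) / n
    mean_sim = sum(y_sim) / n
    # Sum of cross-products and sums of squares about the means
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(y_obs, y_sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in y_obs)
    ss_sim = sum((s - mean_sim) ** 2 for s in y_sim)
    return cov / math.sqrt(ss_obs * ss_sim)


print(correlation_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Because R measures linear association only, a forecast that is consistently biased can still score close to 1.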

Reservoir performance simulation
Reservoir behaviour simulation employed the mass balance equation (McMahon and Adeloye, 2005):

$$S_{t+1} = S_t + Q_t - E_t - D'_t \qquad (2)$$

subject to the operational policy for the reservoir, where $S_t$ and $S_{t+1}$ are respectively the storage at the beginning and end of time period $t$; $Q_t$ is the inflow to the reservoir during $t$; $E_t$ is the net evaporation (evaporation minus direct rainfall) in period $t$; and $D'_t$ is the total water release towards meeting the target demand $D_t$ during $t$.
As noted previously, the water available for allocation during $t$, $WA_t$, is:

$$WA_t = S_t + Q_t \qquad (3)$$

and assumes that the inflow is known at the start of the month when making the release decision. In practice, however, this is not the case and assumptions about the size of the anticipated inflow must be made. If the actual inflow turns out to be exactly the same as the assumed inflow, then the end-of-period storage will be exactly as given by Eq. (2). If, however, there is a discrepancy, the actual end-of-period storage will be different from Eq. (2). Let the actual end-of-period storage be $S_{end,t}$; the relationships between this and $S_{t+1}$ for each of the inflow knowledge assumptions become:
1. Type A: $WA_t = S_t + Q_t$ and $S_{end,t} = S_{t+1}$
2. Type F: $WA_t = S_t + \hat{Q}_t$ and $S_{end,t} = S_{t+1} + Q_t - \hat{Q}_t$
3. Type M: $WA_t = S_t + \bar{Q}_t$ and $S_{end,t} = S_{t+1} + Q_t - \bar{Q}_t$
4. Type N: $WA_t = S_t$ and $S_{end,t} = S_{t+1} + Q_t$
where $Q_t$ is the observed (correct) inflow during time $t$, $\hat{Q}_t$ is the corresponding forecast inflow, $\bar{Q}_t$ is the historic mean flow for the month of time $t$, and $S_{end,t}$ is the adjusted end-of-period storage.
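The four inflow assumptions can be sketched as a single-period update in Python. This is an illustration under stated assumptions: the function names and the `release_rule` callback (standing in for the rule-curve policy) are hypothetical, and storage bounds and spill are ignored.

```python
def assumed_inflow(inflow_type, q_obs, q_forecast, q_mean):
    """Inflow assumed at the start of the month for Types A, F, M and N."""
    if inflow_type == "A":
        return q_obs        # historic (observed) inflow
    if inflow_type == "F":
        return q_forecast   # ANN forecast
    if inflow_type == "M":
        return q_mean       # historic mean for the month
    if inflow_type == "N":
        return 0.0          # no inflow knowledge: storage only
    raise ValueError(f"unknown inflow type: {inflow_type}")


def simulate_period(inflow_type, s_start, q_obs, q_forecast, q_mean,
                    release_rule, net_evap=0.0):
    """Return (available water, release, adjusted end-of-period storage)."""
    q_assumed = assumed_inflow(inflow_type, q_obs, q_forecast, q_mean)
    wa = s_start + q_assumed                 # water available for allocation
    release = release_rule(wa)               # hypothetical policy callback
    # Mass balance using the assumed inflow
    s_next = s_start + q_assumed - net_evap - release
    # Correct for the difference between observed and assumed inflow
    s_end = s_next + q_obs - q_assumed
    return wa, release, s_end


for kind in "AFMN":
    print(kind, simulate_period(kind, 100.0, 30.0, 40.0, 20.0,
                                lambda wa: 0.1 * wa))
```

The assumed inflow affects the final storage only through the release decision: an over-estimated inflow leads to a larger release and hence a lower adjusted storage.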
With the available water determined, release then takes place guided by the rule curves on a case-by-case basis. In Case 3, where $WA_t \le LRC_m$, the reservoir is in deficit operation, i.e. the release is zero (no water released). Here $URC_m$ is the upper rule curve during month $m$ (= 1, 2, 3, . . . , 12) of the year; $LRC_m$ is the lower rule curve during month $m$; and $Y_t$ is the excess water released during period $t$. In general, $t = 12(y - 1) + m$ for years $y$ = 1, 2, 3, . . . , $n$, where $n$ is the number of years in the data record.
Once the simulation is complete, performance indices are then evaluated as follows (McMahon and Adeloye, 2005):
i. Time-based reliability ($R_t$): $R_t = N_s/N$, where $N_s$ is the total number of intervals out of $N$ in which the demand was met.
ii. Volume-based reliability ($R_v$): the total volume supplied divided by the total volume demanded over the simulation.
The sustainability index $\lambda$ combines the time-based reliability with the resilience $\varphi$ and the vulnerability $\eta$:

$$\lambda = \left( R_t \, \varphi \, (1 - \eta) \right)^{1/3}$$

Where there are multiple users or sectors, each of the above indices is evaluated for each sector and these can later be combined to determine a weighted group (or global) index. This was done for the sustainability index $\lambda$ using:

$$\lambda_G = \sum_{j=1}^{M} w_j \lambda_j$$

where $w_j$ is a weight, given by (Sandoval-Solis et al., 2011):

$$w_j = \frac{DS_j}{\sum_{j=1}^{M} DS_j}$$

and $\lambda_G$ is the group sustainability; $\lambda_j$ is the sustainability for user category $j$; $w_j$ is the weighting for user $j$; $M$ is the total number of user sectors; and $DS_j$ is the average annual water demand for user sector $j$.
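The indices above can be illustrated with a short Python sketch. Several points are assumptions made for the example rather than details from the study: a period counts as "met" when the release is at least the demand, the sustainability index is taken as the cube root of the product of time-based reliability, resilience and one minus vulnerability, and the group weights are each sector's share of total average annual demand. All function names are hypothetical.

```python
def time_reliability(releases, demands):
    """R_t = N_s / N, with N_s the number of periods in which demand was met."""
    met = sum(1 for r, d in zip(releases, demands) if r >= d)
    return met / len(demands)


def sustainability(r_t, resilience, vulnerability):
    """Cube root of R_t * resilience * (1 - vulnerability)."""
    return (r_t * resilience * (1.0 - vulnerability)) ** (1.0 / 3.0)


def group_sustainability(sector_indices, sector_demands):
    """Demand-weighted mean of the sector sustainability indices."""
    total = sum(sector_demands)
    weights = [d / total for d in sector_demands]  # w_j, summing to 1
    return sum(w * lam for w, lam in zip(weights, sector_indices))


r_t = time_reliability([10, 8, 10, 10], [10, 10, 10, 10])
print(r_t)
print(group_sustainability([0.8, 0.6], [1.0, 3.0]))
```

Because the weights follow demand, the group index is dominated by the largest user, which in a system like this one would typically be irrigation.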

Study area and data
The Ubonratana reservoir is the largest single multi-purpose reservoir in the upper Chi River Basin in north-eastern Thailand. The dam provides water for consumptive uses (domestic, industrial, irrigation) and Pong River in-stream flow augmentation, as well as flood control (EGAT, 2002).
Data collected for the study included daily reservoir inflows, evaporation, the area-height-storage relationship, weekly and monthly water requirements and operating rule curves for the reservoir. The observed monthly inflow from April 1970 to March 2012 and rainfall from April 1981 to March 2012 were provided by the Electricity Generating Authority of Thailand (EGAT) and the Royal Irrigation Department (RID). The analysis, however, used the overlapping period of April 1982 to March 2012 (i.e. 360 months) for which the rainfall and runoff data were complete. Data on historical water releases to the various sectors were also provided by the RID. The gross water requirements for the analysis period were 28 952 Mm³, i.e. monthly averages of 0.98 Mm³ for public (municipal and industrial) demands, 18.83 Mm³ for downstream requirements and 60.6 Mm³ for irrigation. The original rule curves were also provided by the EGAT; the improved versions of these (see Fig. 2) developed by Chiamsathit et al. (2014) were used in the current study.

ANN inflow forecasts
Based on extensive testing involving the examination of the auto-correlation function (acf - Fig. 3a), partial autocorrelation function (pcf - Fig. 3b) and cross-correlation function (ccf - Fig. 3c), six input variables (i.e. current month historic mean inflow, lagged inflows (t − 1, t − 2, t − 3), and lagged rainfall (t − 1, t − 2)) were used for the ANN modelling. The acf (Fig. 3a) shows infinite attenuation with only the first three lags of inflow being significant. Additionally, the ccf in Fig. 3c indicates that the first two lags of the rainfall are significant. With these, the functional form of the forecast model becomes:

$$\hat{Q}_t = f\left(Q_{t-1}, Q_{t-2}, Q_{t-3}, R_{t-1}, R_{t-2}, \bar{Q}_t\right)$$

where $\hat{Q}_t$ is the one-month-ahead inflow forecast; $Q_{t-1}$, $Q_{t-2}$ and $Q_{t-3}$ are lagged inflows of one month, two months and three months, respectively; $R_{t-1}$ and $R_{t-2}$ are lagged rainfall of one month and two months, respectively; and $\bar{Q}_t$ is the historic mean inflow for the current month.
The ESR was used for the ANN training and for this the 360 months of data were split into three (90 : 5 : 5) for training, validation and testing, respectively. The number of hidden neurons was varied between 1 and 35 and, based on the R criterion, the best architecture had 33 neurons in the hidden layer. Indeed, the final model performed very well, with R exceeding 0.9 in each of training, validation and testing. Figure 4a, b and c compare the predicted and observed inflow during training, validation and testing, respectively, and further confirm the good performance of the forecasting model. The time series of the forecast inflows (April 1982 to March 2012) are also compared in Fig. 5 and this, together with the estimated Nash-Sutcliffe efficiency (NSE) of 0.75, is further evidence of the efficacy of the forecasting model. Additionally, the fact that the NSE was higher than zero is an indication that the model has been a better predictor than the mean value of the observed time series.
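To make the input structure and the NSE check concrete, the following Python sketch assembles the six inputs for each month and evaluates the Nash-Sutcliffe efficiency. The function names, the `first_month` argument and the toy numbers are illustrative assumptions; no data from the study are reproduced.

```python
def build_inputs(inflow, rainfall, monthly_mean, first_month=0):
    """Build one row of six inputs per forecast month.

    inflow, rainfall: monthly series of equal length.
    monthly_mean: 12 historic mean inflows, one per calendar month.
    first_month: calendar index (0-11) of the first element of the series.
    """
    rows, targets = [], []
    for t in range(3, len(inflow)):  # three lags are needed before the first row
        rows.append([
            inflow[t - 1], inflow[t - 2], inflow[t - 3],  # lagged inflows
            rainfall[t - 1], rainfall[t - 2],             # lagged rainfall
            monthly_mean[(first_month + t) % 12],         # mean for month t
        ])
        targets.append(inflow[t])
    return rows, targets


def nash_sutcliffe(obs, sim):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


rows, targets = build_inputs(list(range(1, 8)),
                             [10, 20, 30, 40, 50, 60, 70],
                             [100 + i for i in range(12)])
print(rows[0], targets[0])
```

A forecast equal to the observed mean in every month gives an NSE of exactly zero, which is why any positive value indicates skill beyond the mean.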

Reservoir performance evaluation
The results of the performance evaluation are summarised in Table 1. For convenience, the operating policy with Type A, Type F, Type M and Type N are denoted by P-A, P-F, P-M and P-N, respectively.
As seen in Table 1, in terms of the total amount of water released, P-A, P-F and P-M were significantly better than P-N, which is not surprising given that P-N did not have any additional water from inflows. In terms of reliability ($R_t$ and $R_v$), P-F was marginally better than P-A and significantly better than P-N; P-F was, however, inferior to P-M. A possible reason for this is that in some of the months, the historic monthly mean and forecast inflows were higher than the actual inflows, implying that more water will be released in those months with P-M and P-F than with the other two inflow situations. However, the net effect of such large releases (based on the upwardly-biased inflow forecasts) is the increased number of excursions of the end-of-period storage ($S_{end,t}$) into the region below the LRC, as shown in Table 1 for both P-F and P-M. The other performance indices reported in Table 1 all reveal the superiority of P-F relative to the other inflow situations. For example, the group sustainability index for P-F was the highest of all four; indeed, the same better performance of P-F was recorded across all three (public, in-stream and irrigation) demand sectors supplied by the reservoir. As expected, the conservative nature of P-N resulted in the least number of excursions below the LRC. This is likely to benefit the hydro-power generation potential of the reservoir albeit, as revealed by this study, at the expense of its performance in meeting the consumptive demands.

Conclusion
This study has developed an MLP-ANN model to forecast one-month-ahead inflow for the Ubonratana reservoir in north-eastern Thailand. Extensive testing of the model showed that it was able to provide inflow forecasts with reasonable accuracy. The performance of the ANN forecasts was tested against those of three other inflow scenarios and the reservoir simulation results showed that the ANN forecasts produced superior reservoir performance. The worst performing inflow situation was when there was a complete lack of knowledge about the inflow and the release decision was based on the starting storage alone. All this represents an objective demonstration of the importance of good inflow forecast knowledge for effective reservoir operation.