Improvement of operational flood forecasting through the assimilation of satellite observations and multiple river flow data

Data assimilation has the potential to improve flood forecasting. However, it is rarely employed in distributed hydrologic models for operational predictions. In this study, we present variational assimilation of river flow data at multiple locations and of land surface temperature (LST) from satellite in a distributed hydrologic model that is part of the operational forecasting chain for the Arno river, in central Italy. LST is used to estimate initial condition of soil moisture through a coupled surface energy/water balance scheme. We present here several hindcast experiments to assess the performances of the assimi5 lation system. The results show that assimilation can significantly improve flood forecasting, although in the limit of data error and model structure.


Introduction
The potential of data assimilation in hydrology has been demonstrated by several studies (e.g., Clark et al., 2008;Seo et al., 2009;Brocca et al., 2010;Lee et al., 2012;Laiolo et al., 2015).However, the usage of data assimilation in distributed hydrologic models for operational flood forecasts is limited by many issues: non-linear and discontinuous model structure, non-Gaussian/multiplicative errors, large dimensionality of the inverse problem, model governed by different equations, complex topology of domains such as surface drainage and river network.Moreover, the majority of studies investigates capabilities of data assimilation through synthetic experiments, while applications conducted from an operational perspective are rare, although the need for an effective transition of research advances into operational forecasting systems has been increasingly claimed in recent years (Liu et al., 2012).This work presents variational assimilation of flow data at multiple locations and of land surface temperature (LST) maps from satellite in a distributed hydrologic model that is part of the operational forecasting chain for the Arno river, in central Italy.We assess the actual gain that can be obtained in flood predictions through data assimilation in order to answer the unsolved doubts of whether the system has enough long memory, or hydrologic models are sufficiently adherent to reality, to significantly benefit from more accurate initial conditions.We show results from several hindcast experiments in the Arno basin.

MOBIDIC (MOdello di Bilancio Idrologico DIstribuito e
Continuo) is a physically-based hydrologic model (Campo et al., 2006;Castelli et al., 2009;Yang et al., 2014a, b).It is continuous in time and distributed in space, with rasterbased horizontal discretization.MOBIDIC solves a coupled mass and energy balance at the surface and employs a computationally efficient representation of soil moisture dynamics, that has been recently improved in Castillo et al. (2015).It has a single layer scheme for soil, whose peculiarity is the conceptual subdivision of the layer into two non-linear reservoirs, the capillary and gravitational one.They correspond to larger pores that drains under gravity and smaller pores that hold water through capillary forces.The two components control different sets of hydrologic fluxes.In particular, available water for evapotranspiration is capillary water, that is fed by gravitational water through an absorption flux.Interactions between surface and subsurface hydrology are explicitly taken into account.Groundwater dynamics may be modeled through 2-D Dupuit approximation or as a linear Published by Copernicus Publications on behalf of the International Association of Hydrological Sciences.
reservoir.The latter method is employed in the present work.Three options are available for flow routing through the network, i.e. the lag approach, the Muskingum-Cunge method and the cascade of liner reservoirs.MOBIDIC runs operatively at the hydrologic service of Tuscany region (Servizio Idrologico Regionale, Regione Toscana) for floods forecasting and water resources management purposes.

The assimilation scheme
For the assimilation system the variational approach has been selected, since it requires less restrictive hypothesis than Kalman and Montecarlo filters and smoothers.The payback is the need for an adjoint model, that is challenging to derive for distributed hydrologic models.Separate adjoint models have been derived for MOBIDIC's modules of flow routing through the river network and of water and energy balance at the soil surface.The first one is devoted to the assimilation of discharge data at multiple locations, while the second one deals with LST observations from satellite.

Adjoint model of flow routing in the river network
The adjoint model of flow routing has been developed for the method of linear reservoirs in cascade, which represents the optimal compromise between complexity and representativeness of the physical process.Hence, the routing is driven by: where, considering a network composed by n reaches, Q ∈ R n are the discharges exiting each reach, q L ∈ R n are the lateral inflows (surface runoff plus groundwater flow), and A ∈ R n×n is a diagonal matrix with the inverse of the characteristic time of each river on the diagonal.Lastly, U ∈ R n×n is a binary matrix accounting for network topology.Following a classical variational approach (Castelli et al., 1999;Caparrini et al., 2004), the assimilation scheme provides optimal estimates of discharge initial condition and input by minimizing a penalty functional J .J contains squared errors between states predictions and observations and between current and previous values of the quantities to optimize.The physical constraint is imposed by adjoining Eq. (1) to J through a vector of Lagrange multipliers λ ∈ R n .
In Eq. ( 2), K Q , K Q 0 , K q L ∈ R n×n are weighting factors applied to the various terms composing J , Q obs ∈ R n are the observed discharges available inside the assimilation window [t 0 , t 1 ], the sign (•) indicates the previous value of the quantity.J is minimized if its first variation δJ vanishes, condition that, after some computations, leads to a system of ordinary differential equations that describes the time evolution of Lagrange multipliers.Furthermore, a terminal condition for the backward integration of the adjoint model and update equations for initial streamflow Q 0 and lateral input q L that depends on λ are obtained.The optimal estimate of Q 0 and q L is obtained through an iterative procedure constituted by subsequent integrations of forward and adjoint model and corresponding updates.Iterations are interrupted when updates become negligible.To note that the corrections evaluated at discharge measurement stations spread upstream thanks to the coupling between equations of flow channel routing (see Eq. 1).Since our aim is to improve both flow routing and runoff formation processes, the equations driving these latter should be included in the derivation of the adjoint model.However, it would be an extremely challenging task, mainly because of the threshold processes that characterize soil moisture dynamics.Therefore, we employ a mixed variational-Montecarlo approach.First, through the flow routing adjoint model, we estimate the optimal temporal evolution of lateral inflow q L .Then, on its basis, we infer key variables that determine runoff formation through a parsimonious Montecarlo approach.As key variables, we selected initial condition of water content in capillary soil and rainfall intermittence, that is represented by the parameter f 0 (frequency of no-rainfall during a certain time step, see Castelli, 1996).An ensemble covering the extent of the assimilation window is generated for q L by reasonably varying initial soil moisture and f 0 .Both initial capillary water and f 0 are maintained spatially homogeneous, and hence the size of the ensemble remains small (typically around 100 realizations).At each iteration of the assimilation procedure, the realization with the minimum distance from the desired trajectory of q L is selected for any single reach, and the corresponding initial capillary water and f 0 are adopted for the contributing cells.Hence, a spatially distributed estimate of both quantities is obtained.

Adjoint model of soil water and energy balance
As described in Sect.2, MOBIDIC solves a coupled water and energy balance at the surface.Hence, it reproduces the partitioning of available energy between latent and sensible Proc.IAHS, 373, 167-173, 2016 proc-iahs.net/373/167/2016/heat flux that is determined by soil moisture.This characteristic allows to estimate soil moisture initial condition through the assimilation of LST observations from satellite.The advantage of assimilating LST maps instead of soil moisture products lies in their higher spatio-temporal resolution and accuracy (Campo et al., 2012).Among MOBIDIC's driving equations, a set of three ODEs can be identified as significant for the assimilation of LST.The system is here reported in synthetic form to highlight the relevant dependencies.
where H and LE are surface turbulent heat fluxes of sensible and latent heat respectively, T s is the land surface temperature, T d is the temperature of a deeper layer of soil, W c and W g are capillary and gravitational soil water content.The system can be solved pixel by pixel, being the states of a specific cell independent from those of the others.In the developed assimilation framework T s , W c and W g are the analyzed states, whose optimal estimation is obtained by minimization of the penalty functional J : where all the state variables are vector in R n , being n the number of cells composing the basin.K T s , K T d and K W c ∈ R n×n are weighting factors applied to the various terms of the penalty functional, T obs s ∈ R n are measurements of LST available inside the assimilation window [t 0 , t 1 ] and the sign (•) indicates the previous value of the quantity.The penalty functional is formed by quadratic errors in respect to both measurements and previous values of the quantity to estimate.The last term is the physical constrain adjoined through Lagrange multipliers λ 1 , λ 2 and λ 3 ∈ R n , each one corresponding to a specific state variable.J is minimized when its first variation vanishes, condition that, after some computations, leads to ODEs driving the time evolution of λ 1 , λ 2 and λ 3 .Terminal conditions for their backward integration are also obtained, as well as update equations for initial condition of T s , T d , W c and W g that depend on Lagrange multipliers.Forward and adjoint model are solved iteratively and corresponding updates are computed.Iterations are interrupted when updates become negligible.

Application: flood forecasts in the Arno river basin
The assimilation scheme is tested in the Arno river basin, central Italy.The basin extends over about 8300 km 2 .Figure 1 shows Digital Elevation Model, river network and flow measurement stations employed in this work.Flood forecasting is a relevant issue in Arno basin, since Arno passes through major Tuscan cities, as Florence and Pisa.The performances of the assimilation system are assessed through several hindcast experiments.Simulations are run with the spatial and temporal resolutions that are employed operationally, i.e. 500 m and 15 min.The analysis is performed mainly in terms of flow peak prediction accuracy, since it is one of the most important skills for floods early warning.

Results from assimilation of flow data
Experiments devoted to test assimilation of multiple flow data include both high flow and false alarm (high rainfall but low flows) events that occurred in the period 2009-2014.Flow observations are available from the 5 measurement stations shown in Fig. 1, with a temporal resolution of 15 min.Sequential assimilations are realized on windows of 6 h, employing the data from all the stations simultaneously.Analysis of initial discharge in the river network and capillary water in soil, as well as the optimal estimate of the parameter f 0 , are obtained for each assimilation window, and then used to run the corresponding prediction simulation.To evaluate the assimilation system we focus on results at S. Giovanni alla Vena, that is the closest station to the outlet and can be considered as representative of the overall functioning of the scheme.Figure 2 summarizes the results obtained for all the 16 examined events.Peak flows at S. Giovanni alla Vena are plotted against the corresponding total volume of rainfall.
Observations and open loop are marked with black stars and white circles respectively.Forecasts from data assimilation are white triangles and gray circles.These latter correspond to the last assimilation window, that includes observed peaks at the upstream locations (Subbiano and Montevarchi, and in some cases also Nave di Rosano), while gray circles are from an antecedent window.The general behavior is, as desired, of enhanced flow peak predictions from data assimilation in respect to open loop.Gray circles show a remarkable better adherence to observations than white triangles, suggesting that the gain increases significantly when upstrem peaks are included in the assimilation window.However, in some experiments data assimilation has a slightly negative impact on predictions.This occurs when open loop is already very close to the observations.To give further insight, hydrographs at S. Giovanni alla Vena for three events are reported in Figs. 3, 4 and 5.They  are characterized by different levels of performance.In the false alarm event of 6 April 2010 (Fig. 3), forecasts progressively improve as the assimilation window advances in time.Predictions corresponding to the last window match observations almost perfectly.Conversely, in the flood event of 13 November 2012 (Fig. 4), data assimilation initially reduces the adherence of the forecasted hydrograph to observations, especially in terms of peak flow.The reason is that assimilation attempts to lower streamflow, that in open loop is significantly overestimated during the first window.Improvements are obtained in the third step, whose window cor- responds to the actual beginning of the event rising limb.It results into an excellent reproduction of data, that is slightly deteriorated by the subsequent assimilation.The final predicted peak is affected by an error almost equivalent to that of the open loop (about 8.4 %), although overestimation instead of underestimation.The behavior for the flood event of 18 November 2014 (Fig. 5) is intermediate.Predictions from data assimilation are always better than open loop.However, a progressive improvement as in the false alarm event of 6 April 2010 is not observed.For instance, the second assimilation slightly worsen the forecasts corresponding to the first one.The subsequent assimilation recovers the gap and predictions from the last window are definitely a significant enhancement of the open loop.
In summary, the assimilation scheme can considerably improve flood forecasts and reduce false alarms.Nevertheless, in presence of already accurate prediction, the assimilation system can not provide additional enhancement.This fact indicates that the gain obtainable through the assimilation scheme is limited by model structure, namely, model errors can not be overcome by the assimilation, as well as by possible errors in data.Conversely, an exacerbate forcing of modeled outputs toward observations can lead to a negative impact on the forecasts.

Results from assimilation of LST observations
Assimilation of LST observations is tested on a late summer event (16-18 September 2006) characterized by locally heavy rainfall (up to 250 mm) but quite low streamflow values.The employed LST observations are from MeteosatSG-SEVIRI (LandSAF product at 15 min with a spatial resolution of about 3 km).Assimilation is performed for 13 September in order to evaluate a proper initial soil  moisture before rainfall starts.Figure 6 shows soil moisture at the end of 13 September for the open loop and its analysis obtained through data assimilation.Figure 7 compares streamflow observations at Nave di Rosano with hydrographs from simulations initialized with a complete dry soil, soil saturation level equal to 50 % of field capacity and soil moisture estimated from the assimilation of LST.The peak flow is well reproduced by the latter, while half-field capacity run significantly overestimates discharges and the dry run suffers underestimation.Figure 8  other stations.Dry and assimilation run are closer to observations (black stars), with the latter better predicting the peak at some locations.Nevertheless, model results do not manage to reproduce the variability in the dependence of flow peak from upstream precipitation that characterizes observations.This fact reveals that the assimilation system does not add sufficiently flexibility in the model, although it improves the estimate of initial soil moisture.However, tests on several events would be necessary to properly evaluate LST assimilation.

Conclusions
This work presents variational assimilation of flow data at multiple locations and of LST in the distributed hydrologic model MOBIDIC, that is part of the operational forecasting chain of the Arno river in central Italy.Flow data are employed to optimally estimate initial discharge in the river network, initial capillary water in soil and rainfall intermittence parameter f 0 .LST is exploited to infer initial soil moisture (capillary and gravitational).The performances of the developed assimilation system are assessed in several hindcast experiments.In particular, the scheme for assimilation of flow data is tested through 16 experiments.The scheme produces relevant improvements in flood forecasting, especially when the assimilation window includes peak flows at upstream locations.Some negative impacts of the assimilation are observed in case open loop results are already very close to measurements, suggesting that the obtainable gain is limited by the structure of the hydrologic model and flow data errors.The scheme for LST assimilation is verified on one single event with locally heavy rainfall but low streamflows.LST is assimilated 2 days in advance in respect to the event.Com-parison of simulation results with observations of flow at various locations shows that the initial condition of soil moisture estimated through LST assimilation improves model performances in respect to dry soil and half-field capacity runs.However, the assimilation does not add sufficiently flexibility in the model to fully reproduce the observed variability in runoff formation process.Future work will analyze in more detail the potential of LST assimilation by performing additional hindcast experiments.

Figure 1 .
Figure 1.DEM and river network of Arno basin used in the simulations.Black circles indicate measurement stations available for experiments of flow data assimilation.

)Figure 3 .
Figure 3. Hydrographs at S. Giovanni alla Vena measurement station for the false alarm event of 6 April 2010.Flow observations are gray dots, open loop simulation is the solid line, predictions using analysis obtained by assimilating flow data during the 1st, 2nd, 3rd and 4th assimilation window (Assim.Win.) are the dotted, dashed, dash-dot and solid thick line respectively.

Figure 4 .
Figure 4. Hydrographs at S. Giovanni alla Vena measurement station for the flood event of 13 November 2012.Lines, symbols and acronyms maintain the same meaning of Fig. 3.

Figure 5 .
Figure 5. Hydrographs at S. Giovanni alla Vena measurement station for the flood event of 18 November 2014.Lines, symbols and acronyms maintain the same meaning of Fig. 3.

Figure 6 .
Figure 6.Soil moisture maps of background (a), analysis (b) and increment between them (c) at the end of the assimilation window.Values are expressed as saturation level (%).

Figure 8 .
Figure 8. Flow peak versus precipitation volume at the available measurement stations for the test event of LST assimilation.Acronyms in the legend maintain the same meaning of Fig. 7.
at Nave di Rosano measurement station for the test event of LST assimilation.Flow observations are gray dots, predictions initialized with a complete dry soil, soil saturation level equal to 50 % of field capacity, and soil moisture estimated from the assimilation of LST, are the solid, dashed and solid thick line respectively.