Visualising DEM-related flood-map uncertainties using a disparity-distance equation algorithm

The apparent absoluteness of information presented by crisply delineated flood boundaries can lead to misconceptions among planners about the inherent uncertainties associated with generated flood maps. Even maps based on hydraulic modelling using the highest-resolution digital elevation models (DEMs), and calibrated with optimal Manning’s roughness (n) coefficients, are susceptible to errors when compared to actual flood boundaries, specifically in flat areas. Therefore, the inaccuracies in inundation extents, brought about by the characteristics of the slope perpendicular to the flow direction of the river, have to be accounted for. Instead of using the typical Monte Carlo simulation and probabilistic methods for uncertainty quantification, an empirical-based disparity-distance equation that considers the effects of both the DEM resolution and slope was used to create prediction-uncertainty zones around the resulting inundation extents of a one-dimensional (1-D) hydraulic model. The equation was originally derived for the Eskilstuna River where flood maps, based on DEM data of different resolutions, were evaluated for the slope-disparity relationship. To assess whether the equation is applicable to another river with different characteristics, modelled inundation extents from the Testebo River were utilised and tested with the equation. By using the cross-sectional locations, water surface elevations, and DEM, uncertainty zones around the original inundation boundary line can be produced for different confidence levels. The results show that (1) the proposed method is useful both for estimating and directly visualising model inaccuracies caused by the combined effects of slope and DEM resolution, and (2) the DEM-related uncertainties alone do not account for the total inaccuracy of the derived flood map. Decision-makers can apply it to already existing flood maps, thereby recapitulating and re-analysing the inundation boundaries and the areas that are uncertain. Hence, more comprehensive flood information can be provided when determining locations where extra precautions are needed. Yet, when applied, users must also be aware that there are other factors that can influence the extent of the delineated flood boundary.


Background
In a time of climate change, many countries now require production of flood risk maps when planning and managing built-up areas. Hence, numerous maps have been produced and many of them constitute important documents to hinder or mitigate the negative consequences big floods may bring with them. However, as the maps usually show a potential flood inundation area corresponding to a water discharge so big it has never been experienced before, it is virtually impossible to get an exact, or even near, match between the model output and the future actual flood. This calls for sensitivity and uncertainty modelling (cf. for example Merwade et al., 2008; Pappenberger et al., 2008, for general treatises on these issues). Among the different methods, Monte Carlo simulations and probabilistic methods have been commonly used for analysing uncertainties in flood models (e.g., Pappenberger et al., 2005; Werner et al., 2005a). The techniques are applied to test the sensitivity of the results, in terms of the produced water surface elevation, discharge and the flood's spatial extent, to hydraulic model inputs by randomly altering the parameters or parameter sets used. The robustness of the model is often attributed to the number of realisations (from a few hundred to thousands) that the models are fed with, which in turn are used for computing statistics that can quantify the uncertainty.
Published by Copernicus Publications on behalf of the International Association of Hydrological Sciences.
Provided that the correct water discharge has been selected, examples of issues causing uncertainties include: ground and river-bottom friction parametrisation (Schumann et al., 2007; Werner et al., 2005b), buildings (Koivumäki et al., 2010), cross-section spacing (for 1-D models) (Castellarin et al., 2009; Cook and Merwade, 2009), the modeller's and end users' skills and competences, etc. Nevertheless, the quality of the digital elevation model (DEM) has been shown to have a profound effect on the correctness of the inundation boundary delineation, something which will be looked into deeper in this paper.

DEM-related uncertainties
The influence of DEM resolution on flood modelling accuracy has been acknowledged by a number of authors (e.g., Brandt, 2005; Casas et al., 2006; Cook and Merwade, 2009). Unlike input sensitivities, the ambiguities produced by the DEM are most often quantified by comparing flood extents or water surface elevations produced using various DEM resolutions. Hence, the implementation of Monte Carlo simulations for digital elevation models is more limited because they require alterations, for instance, of the resolution or accuracy of the DEM for each test, prior to the simulation process. Also, depending on the quality of the DEM to be tested, the hydraulic model used, as well as the size of the test site, the modelling time may increase dramatically.
Previous research has also rather consistently (cf. Brandt, 2016) concluded that DEMs of between 4 and 10 m resolution (or point spacing) are enough to model flood boundaries with sufficient accuracy (e.g., Werner, 2001; Casas et al., 2006; Raber et al., 2007). However, those recommendations are based on the total characteristics of the model results; i.e. the total areal extent may not differ much between the model and the true inundation areal extent. This means that local terrain conditions have not been accounted for, and hence, there may be quite large discrepancies at some locations. Although those may be few in number, they may still severely impact the suitability for planning, constructing, or managing urban areas. A first attempt to see whether the disparities between modelled and real flood boundaries could be linked via the quality of the DEM was reported in Brandt (2009), using two particular river reaches of the Eskilstuna River, Sweden. This was further elaborated in Brandt and Lim (2012), and later quantified by Brandt (2016) through an empirically derived equation together with an algorithm capable of producing uncertainty zones of varying width. As this empirical equation is based on only one single river, there is a need for comparative studies to verify the approach of modelling and visualising DEM and slope-dependent uncertainties.

Aims and objectives
The aim of this paper is to illustrate the relation between the DEM quality and the accuracy of flood modelling. Using the Testebo River, Sweden, as test area for the empirical-based disparity-distance equation and algorithm (from Brandt, 2016), which considers both DEM resolution and terrain slope to create prediction-uncertainty zones around resulting inundation extents of a one-dimensional (1-D) hydraulic model, the study's objectives are to:
- assess whether the equation is applicable on rivers with different characteristics than the Eskilstuna River;
- relate the DEM and slope contributions as part of the total flood boundary delineation's uncertainty;
- identify both advantages and limitations of the method in accounting for model uncertainty based on DEM resolution and the slope characteristics of the floodplain.

Methods
To evaluate the applicability of the disparity-distance equation and algorithm by Brandt (2016) in deriving uncertainty boundaries that account for both the DEM resolution and the slope characteristics of the area, Lim's (2011) earlier modelling results for a part of the Testebo River (i.e. Forsby and Varva), north of Gävle, Sweden, were utilised as test cases. This area is characterised by flat floodplains, composed mainly of pasture lands. The results were derived from two different TIN elevation models: one from Lantmäteriet's (the Swedish mapping, cadastral and land registration authority) 50 m data, and one from filtered laser-scanned data (from SWECO), with an equivalent resolution of 2.1 m. The TIN models, which served as the main topographic data, were created from each elevation data set, in combination with river bathymetry comprising echo-sounded and interpolated channel points (Lim, 2009). The hydraulic modelling was performed with the HEC-RAS steady-flow model, using a peak discharge of 160 m³ s⁻¹. This was equivalent to the 100-year flow in the area, and also to the flooding that happened in 1977, on which the validation data for the study were based.
Because model results, especially from 1-D models such as HEC-RAS, can be affected by both the spacing and number of cross sections in defining the topographic characteristics of the area (Cook and Merwade, 2009), and how a modeller assigns parameters in the hydraulic model, these impacts on the delineation of the uncertainty zones were also looked at in this present study. Modelling results of Testebo River by both Brandt and Lim (cf. Brandt and Lim, 2012) were used in determining uncertain areas and compared with each other. Although both were based on the same elevation model (i.e. LiDAR data combined with river bathymetry), the cross sections, roughness coefficients and boundary conditions used varied between them.

Disparity calculations
The modelled flood boundary will almost never coincide with the true flood boundary. In most cases the two will be close to each other, but at some locations they may be far apart. In order to quantify these differences, the distances and slopes (taken from the locations where the hydraulic model's cross sections intersect the flood boundary lines) were measured according to Fig. 1a. When the distance disparities are plotted against the slopes, a point cloud appears in which the highest disparities fall distinctly as the slope increases (Fig. 1b). In Fig. 1b, it can also be seen that a DEM with poorer quality will in general produce higher disparities for the same slopes.
Based on the appearance of the point cloud, it follows that the disparity distance (D_d) can be expressed as a function of slope (S) [m m⁻¹] perpendicular to the flow direction, resolution or cell size of the DEM (δ) [m], and percentile of interest or confidence level (P) [%] for estimating the uncertainties. Taking S as the main independent variable, it can be expressed as:

$$D_d = c\,S^{z}, \qquad (1)$$

where c and z are coefficient and exponent, respectively, containing δ and P. Based on the results from the Eskilstuna River, the quantile regression for a number of hydraulic simulations with different DEM resolutions yielded the following empirical equation (Brandt, 2016):

$$D_d = \ldots\,1124\,\ln(\delta) + 0.0709\,\ln(P) - 1.0064]\,, \qquad (2)$$
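The power-law form of Eq. (1) can be sketched in a few lines of Python. This is only an illustration of the functional shape: the function name `disparity_distance` is hypothetical, and the coefficient and exponent used in the example are placeholders, not the fitted values of Brandt (2016).

```python
def disparity_distance(slope, c, z):
    """Eq. (1): D_d = c * S**z, with slope S in m/m and D_d in m."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return c * slope ** z


# Placeholder values for illustration only (NOT the coefficients of Eq. 2).
C_DEMO, Z_DEMO = 1.0, -0.5

flat = disparity_distance(0.002, C_DEMO, Z_DEMO)   # gentle floodplain slope
steep = disparity_distance(0.2, C_DEMO, Z_DEMO)    # steep valley side
```

With a negative exponent, the flat slope yields the larger disparity distance, which mirrors the falling tendency of the point cloud in Fig. 1b.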

Algorithm
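To translate the results from the D_d equation to uncertainty zones around the modelled flood boundary line, the algorithm presented in Brandt (2016) was used. In short, the following procedure is carried out (Fig. 2): (1) The modelled flood polygon, the hydraulic model's cross sections, and the DEM are used as input. (2) In an iterative process (i.e. going from the cross-section/modelled flood-boundary intersection both towards the centre of the river and away from the river), first, calculate the slope and distance between the node and intersection. Next, check whether the distance exceeds the D_d distance for the corresponding slope. (3) When D_d is exceeded, inner and outer uncertainty elevation values are recorded. (4) The cross sections are populated with the uncertainty elevation values and from these, two water surface elevation models are created: one with inner uncertainty elevations and another with outer uncertainty elevations. (5) Finally, the water surface elevation models are compared with the DEM to produce three distinctive areas: almost certain to be flooded, uncertain to be flooded, and almost certain not to be flooded.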

GIS implementation
The algorithm was implemented in GIS using the following data: (1) the flood polygon from the HEC-RAS modelling, (2) the river cross sections with the water surface elevation, (3) the stream centreline, which was used for dividing each cross section into left and right parts (while looking downstream along the channel), and (4) a DEM over the area. Two point layers were generated. The first layer consisted of points extracted at the intersections of the cross sections and the flood boundary produced from the model simulation. The x and y coordinates for these points were derived, together with the water surface elevation at the given location. The points were coded according to the reference number of the cross section where they were extracted, and according to whether they were located at the right or left part of the channel (Fig. 3). The second layer consisted of nodes sampled along each cross section where it intersected the edges of the TIN model (Fig. 3). Similar to the first set of points where the flood extent and cross section intersected, these nodes were coded according to the cross-section number and their location at the channel. Furthermore, it was also determined whether they were located inside or outside the modelled flood extent. Their x and y coordinates, together with the elevation information, were extracted for these points.
To determine the uncertainty boundaries for the inner and outer flooded extents, four new elevation values have to be derived per cross section: an outer and inner elevation for the left and right parts of the channel, respectively. Beginning with the first cross section on the left-hand side of the channel, the disparity of the sampled nodes inside the flooded extent from the cross-section/flood-polygon intersection was determined using Eq. (2), at the 95 % confidence level. This was computed node-by-node, starting from the node closest to the flood boundary, going towards the river's centre. The computation was stopped when the node's distance from the intersection exceeded the D_d value from Eq. (2). The elevation from this node was taken and assigned as the inner elevation for all the sampled nodes at the left part of the cross section.
To get the outer flood boundary elevation for the left part of the channel, the disparity computation was repeated for the sampled points outside the flood boundary, beginning from the node nearest the intersection of the cross section and the modelled flood boundary. Again, once the node's distance from the intersection exceeded the disparity computed from Eq. (2), the computation was stopped and the elevation value at this node was used as the outer elevation for that cross section's nodes at the left channel. These steps were repeated for the sampled nodes at the right part of the cross section to get the inner and outer elevation boundaries. Then the same procedures were followed for the sampled nodes in the succeeding cross sections.
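The node-by-node search described above can be sketched as follows. This is a minimal illustration, not the GIS implementation used in the study: the function names, the placeholder `demo_dd` function (which stands in for Eq. 2), and the fallback to the last node when D_d is never exceeded are all assumptions made for the example.

```python
import math


def uncertainty_elevation(intersection, nodes, dd_func):
    """Return the elevation at the first node whose distance exceeds D_d.

    intersection: (x, y, z) of the cross-section/flood-boundary intersection
    nodes: (x, y, z) tuples ordered by increasing distance from the intersection
    dd_func: callable giving the disparity distance D_d [m] for a slope [m/m]
    """
    x0, y0, z0 = intersection
    elevation = None
    for x, y, z in nodes:
        distance = math.hypot(x - x0, y - y0)
        if distance == 0.0:
            continue  # node coincides with the intersection itself
        slope = abs(z - z0) / distance
        elevation = z
        # A zero slope gives an unbounded D_d in a falling power law,
        # so the walk simply continues to the next node.
        if slope > 0.0 and distance > dd_func(slope):
            break
    # Assumption: if D_d is never exceeded, the last node's elevation is used.
    return elevation


def demo_dd(slope):
    # Hypothetical placeholder, NOT the fitted Eq. (2).
    return 1.0 * slope ** -0.5


outer_nodes = [(5.0, 0.0, 10.01), (10.0, 0.0, 10.05),
               (15.0, 0.0, 11.5), (20.0, 0.0, 12.0)]
outer_elevation = uncertainty_elevation((0.0, 0.0, 10.0), outer_nodes, demo_dd)
```

In this toy cross section the walk passes the two nearly flat nodes and stops at the third, where the ground rises sharply; the same function would be called once per side and per direction (inner, outer) to obtain the four elevations of a cross section.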
After determining the inner and outer elevations for all sampled nodes, two new point datasets were generated: one containing all points assigned with the outer elevation, and another having the inner elevation information. These point datasets were mapped and used for producing new uncertainty TIN models with the new elevation values. Each of the TIN models was then subtracted from the DEM to delineate areas that are flooded, uncertain, and not flooded.
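The final comparison of the two uncertainty surfaces with the DEM can be illustrated on a raster-like grid. The sketch below uses plain nested lists and hypothetical function names; it assumes that a cell counts as wet when its ground elevation is at or below the corresponding surface, which is an illustrative convention rather than a detail taken from the study.

```python
def classify_cell(ground, inner_ws, outer_ws):
    """Classify one cell from its ground elevation and the two surfaces."""
    if ground <= inner_ws:
        return "flooded"      # below the inner uncertainty surface
    if ground <= outer_ws:
        return "uncertain"    # between the inner and outer surfaces
    return "dry"              # above the outer uncertainty surface


def classify_grid(dem, inner, outer):
    """Apply classify_cell to three equally shaped grids (lists of rows)."""
    return [
        [classify_cell(g, i, o) for g, i, o in zip(dem_row, in_row, out_row)]
        for dem_row, in_row, out_row in zip(dem, inner, outer)
    ]


# Toy example: one row of three cells rising away from the river.
dem = [[9.0, 10.2, 11.0]]
inner = [[9.5, 9.5, 9.5]]
outer = [[10.5, 10.5, 10.5]]
zones = classify_grid(dem, inner, outer)
```

The three classes correspond to the areas that are almost certain to be flooded, uncertain to be flooded, and almost certain not to be flooded.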

Agreement between model and actual flood
The digitised extent based on the actual flooding in 1977 was compared with the result of the generated inner and outer uncertainty boundaries, in addition to the original modelling results. This was measured according to the feature agreement statistic for the overlap (FA) used in Raber et al. (2007):

$$FA = \frac{F_{Obs} \cap F}{F_{Obs} \cup F},$$

where the areal size of the overlap between the actual observation (F_Obs) and the modelled results (i.e. F) is divided by the total areal size. The latter is computed from the sum of the size of the overlap, and the sizes of the areas that are overestimated (i.e. areas that are supposed to be dry but are flooded in the modelled results) and underestimated (i.e. areas that are supposed to be flooded but are dry) in the modelled results. Based on this equation, the percentages of areas that were overestimated and underestimated were also computed to see how large a part of the total they constitute.
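As a small illustration of the feature agreement statistic, the sketch below represents the observed and modelled flood extents as sets of equally sized grid-cell identifiers, so that set sizes stand in for areas. The function name and the cell-based representation are assumptions made for the example; the study itself worked with digitised polygons.

```python
def feature_agreement(observed, modelled):
    """Return overlap, overestimation and underestimation as % of the total.

    observed, modelled: sets of identifiers of equally sized cells.
    """
    overlap = observed & modelled          # wet in both
    overestimated = modelled - observed    # modelled wet, observed dry
    underestimated = observed - modelled   # observed wet, modelled dry
    total = len(overlap) + len(overestimated) + len(underestimated)
    if total == 0:
        raise ValueError("both extents are empty")
    return {
        "overlap": 100.0 * len(overlap) / total,
        "overestimated": 100.0 * len(overestimated) / total,
        "underestimated": 100.0 * len(underestimated) / total,
    }


# Toy extents: 2 shared cells, 4 overestimated, 2 underestimated.
observed_cells = {1, 2, 3, 4}
modelled_cells = {3, 4, 5, 6, 7, 8}
stats = feature_agreement(observed_cells, modelled_cells)
```

Because the denominator is the union of both extents, the three percentages always sum to 100.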

Predicted uncertain and flooded areas based on the disparity-distance equation
The uncertainty zones produced using the modelled results from the 50 m data and the LiDAR data are represented by the red regions in Fig. 4, while those that will most likely be flooded are located within the blue areas. The black and dark blue lines are the original boundaries derived from the modelling results and the extents of the actual flooding in 1977, respectively. The remaining areas are those that are foreseen to be dry by the equation.
Both the flooded and the uncertain areas were larger using the 50 m DEM, compared with LiDAR (Table 1). When the results from the two different modellers are compared, the estimated size of the uncertain zone was the same (i.e. 0.22 km²), but the flooded areas differed; Brandt's flood area was 0.13 km² bigger than Lim's. If the flood area from the 50 m data is compared with Brandt's and Lim's results, the difference is 0.18 and 0.31 km², respectively. The width of the uncertainty zones also depended on the characteristics of the terrain. In flat terrain, they were broader, while in areas that are confined by steep side slopes, they were narrower and closer to the original modelling result's extent. For laser-scanned data in relatively flat locations, the outer uncertainty boundary often extends until it reaches a location where the flow of water is restricted by elevated ground.

Comparison of predicted flooding extents with the validation data
Deviation from the 1977-flood data was most evident in the results based on the 50 m data. The areas almost certain to be flooded (i.e. within the inner uncertainty boundary) derived from the equation did not fit the observed data (Fig. 5). However, if the outer extent of the uncertainty zone is considered, almost the entire 1977-flood area is inside the extent.
For the LiDAR data, the largest discrepancy between the modelled and the actual flooding extent existed mainly at the northern part of the study area (Fig. 5). In both modelling results, these areas were predicted to be flooded rather than dry. The disparity from the observed data for the two results at this location was greater than 500 m.
proc-iahs.net/373/153/2016/ Proc. IAHS, 373, 153-159, 2016
In the south-western portion of the river, there is an area supposed to be flooded that was modelled dry by both Brandt and Lim (Fig. 4). Despite the disagreement, the uncertainty in this area was relatively low, as manifested by narrower uncertainty zones, because it bordered steeper terrain. The actual flood line was located above, on elevated ground.
To quantify how much of the predicted flooding within the newly estimated uncertainty extents matches the flooding of 1977, feature agreement statistics were computed for all the results, i.e. the original modelled results and the flooded areas within the inner and outer extents, as shown in Table 2. Overlap with the actual data from 1977 was higher with the LiDAR data (almost 75 %), as revealed by the original results of Brandt and Lim, compared with the 50 m DEM (56.68 %). Although the overlap results varied for the inner and outer boundaries produced by both modellers (wherein higher overlap was computed from Brandt's inner boundaries, while the outer uncertainty boundaries from Lim's modelling results produced a higher percentage), they registered between 70 and 75 % of the total, which was still higher than those produced from the 50 m elevation data.
For the 50 m data, the assessed agreement of the flooded areas within the inner boundary was higher (59.49 %) than that of the original modelling result (56.68 %) and of the areas within the outer extent (41.52 %). The lower agreement of the latter two was attributed to their significantly larger inundation areas. Even the outer uncertainty area was overestimated by 58.42 % (cf. Fig. 5), and together with the lowest underestimation (11.21 %), it is clear that most of the original inundation zones were already within the limits.

Discussion and conclusion
By using the disparity-distance equation, the underlying resolution of the DEM affected how certain or uncertain areas are to be flooded. It was evident in the results that poorer resolution data, such as the 50 m DEM, produced bigger uncertainties than the LiDAR-based elevation models. Because the details in the terrain were lost with the 50 m data, already flat terrain areas became even flatter. With the laser-scanned data, the effect of flat terrain was also demonstrated in terms of the width of the uncertainty zones. The modelled result's uncertainty becomes higher as the slope becomes flatter, while minimal uncertainty was attained in areas with steep-sided slopes. This is because when there is a significant change in the elevation value, the equation and algorithm prevent the uncertainty from extending further in the direction perpendicular to the main flow. It must also be noted that all terrain elevation data used in the study were supplemented with bathymetric data. Thus, the estimated uncertainty using the equation may be higher, particularly in flat areas, if the data had not been supplemented with river bottom elevations. Lack of river bathymetry can generate gentler slopes, making the modelled water flow over the floodplain and extending the water surface extents (Lim, 2009), as well as making the inner uncertainty zone broader towards the river.
This study also shows that the effect of resolution is more significant than the effect brought about by the different modellers using different cross sections, roughness values, and boundary conditions. Based on the feature agreement assessment, the values derived for both modellers differed much less from each other than the values for the 50 m and the LiDAR data did. Their overestimated area was in the flat part of the floodplain, with slope values around 0.002 m m⁻¹. Whether this area will really be flooded or not is difficult to determine, as there may be other factors that also affect the lateral flow, adding to the uncertainty of the flood delineation. On the other hand, the underestimation in both their models occurred in locations where water flow was limited by steep side slopes. In this case, the accuracy of the validation data came into question.
The relatively small differences among the modelled total flooded areas (1.02, 0.84, and 0.71 km² for 50 m, Brandt's LiDAR, and Lim's LiDAR, respectively) compared with the striking difference in uncertain areas (0.93, 0.22, 0.22 km²) between DEM resolutions (but not between modellers) indicate that the effect on the total area is mostly related to parameters such as roughness, whereas the size of the uncertainty zone will be more related to the resolution of the DEM. It did not matter that the modelled boundary of Brandt was generally situated at a distance outside Lim's boundary.
The discrepancy between the 1977 flood and the uncertainty zones produced makes it clear that DEM and slope-dependent uncertainties alone do not account for the total uncertainty. Uncertainties in models have been recognised in the literature (e.g. Pappenberger et al., 2005, 2008; Schumann et al., 2007; Werner et al., 2005a) to be affected by other factors such as the roughness values, boundary conditions and even the presence of structures (both in the river and the floodplain), which could also have contributed to the discrepancy in the produced results. The disparity-distance equation was developed with the Eskilstuna River as base and with the DEM as the only parameter that was varied (cf. Brandt, 2016). Nevertheless, even though it was developed on another river, it most probably provides valuable input in the quantification of the uncertainty zone. The weakness of the equation is that the geographical area it was developed for did not have large areas of very flat slopes. Hence, the equation probably underestimates the disparity distances for coarse resolution DEMs (i.e. the 50 m in this study) (Brandt, 2016), and therefore, the uncertainty zone for the 50 m DEM is probably even bigger than this study indicates. In a flood mapping project, a good idea would be to first produce a sensitivity analysis using the regular inputs (e.g. different roughness), and then apply the disparity-distance algorithm on the smallest and the largest inundation extents, respectively.
Finally, it can be concluded that it is possible to illustrate the relation between the DEM quality and the accuracy of flood maps by using uncertainty zones. Furthermore, the disparity-distance algorithm (Brandt, 2016) can be used not only for existing (e.g. where extra precaution is needed) and future flood risk maps, but also on rivers other than the Eskilstuna River, provided that the analyst and end users are aware that the DEM uncertainty only provides a part of the total uncertainty of the flood boundary delineation.

Figure 1. (a) Calculation of disparity distances. (b) Relation between disparity distance and slope in the Testebo River (values from Lim, 2011).
(1) The modelled flood polygon, the hydraulic model's cross sections, and the DEM are used as input. (2) In an iterative process (i.e. going from the cross-section/modelled flood-boundary intersection both towards the centre of the river and away from the river), first, calculate the slope and distance between the node and intersection. Next, check whether the distance exceeds the D_d distance for the corresponding slope. (3) When D_d is exceeded, inner and outer uncertainty elevation values are recorded. (4) The cross sections are populated with the uncertainty elevation values and from these, two water surface elevation models are created: one with inner uncertainty elevations and another with outer uncertainty elevations.

Figure 3. Points sampled at the cross sections intersecting the flood boundary, and nodes at the cross sections intersecting the edges of the TIN model.

Figure 4. Estimated uncertain and flooded areas using the disparity-distance equation algorithm. Results were based on DEM resolutions of 50 and 2.1 m (LiDAR).

Figure 5. Differences in the flooded areas within the generated inner and outer boundaries from the equation, in comparison with the real flooding data.

Table 1. Total sizes of areas predicted to be uncertain and flooded using the disparity-distance equation.

Table 2. Results of feature agreement statistics for the different modelling results derived from DEMs having different data resolution, and as produced by different modellers, compared with the flood of 1977.