A European Flood Database: facilitating comprehensive flood research beyond administrative boundaries

The current work addresses one of the key building blocks towards an improved understanding of flood processes and associated changes in flood characteristics and regimes in Europe: the development of a comprehensive, extensive European flood database. The presented work results from ongoing cross-border research collaborations initiated with data collection and joint interpretation in mind. A detailed account of the current state, characteristics, and spatial and temporal coverage of the European Flood Database is presented. The hydrological data collection is still growing and currently consists of annual maximum and daily mean discharge series from over 7000 hydrometric stations with various series lengths. Moreover, the database currently comprises data from over 50 different data sources. The time series have been obtained from different national and regional data sources in a collaborative effort of a joint European flood research agreement based on the exchange of data, models and expertise, and from existing international data collections and open source websites. These ongoing efforts are contributing to advancing the understanding of regional flood processes beyond individual country boundaries and to more coherent flood research in Europe.


Introduction
Flooding is a recurring hydrological phenomenon that often does not affect a single catchment but extends beyond catchment boundaries, resulting in a regional hazard that crosses political boundaries. Over the last decades, there has been a perception that the number, magnitude and frequency of floods in Europe have been changing. However, catchment-based or regional analyses of these variables in the scientific literature have shown little consistency on whether floods have increased or decreased in Europe (Hall et al., 2014). Therefore, to investigate this matter further, new insights into floods in general, flood generating processes and their associated changes are needed.
Because floods generally occur in a wide spatial context, a better understanding can be gained by analysing observed floods in both regionally extended and longer-term contexts. For example, Europe has several hydro-climatologically distinct sub-regions, which offer a unique set of large-scale study areas in which observed floods can be compared and analysed.
To improve our understanding of flood processes and associated changes in flood regimes at a European scale, a single extensive large-scale database is essential (Hall et al., 2014), as these data are currently dispersed across various national and international data sources, some of which are only recorded on paper. Under the umbrella of the European Research Council (ERC) project "Deciphering River Flood Change" (FloodChange), a joint European flood change research network has been established.
Within the network, the time series of the European Flood Database were obtained from different national data sources in a collaborative effort of a joint European flood research agreement based on the exchange of data, models and expertise, and from existing international data collections and open source websites. In the course of developing this database, an attempt is being made to bring all these different data sources together, and to merge the different formats of existing or new (not yet digitised) data sources into a comprehensive European Flood Database. These ongoing efforts in building the database are a key building block, and the cross-border research collaborations are contributing to advancing the understanding of regional flood processes beyond individual country boundaries and to making flood research in Europe more coherent.
In the following sections, a detailed account of the preliminary state of the European Flood Database (as of January 2015) and of its spatial and temporal coverage is presented, together with a summary and outlook on the still ongoing work on the database.

Data sources of the European Flood Database
The raw hydrological data compiled and transformed for merging into the European Flood Database have been supplied by many different individual, national, and already existing international data sources. The database comprises time series of daily mean discharges, annual maximum peak discharges, or both variables, depending on data availability and on the data distribution/sharing policies currently in place. For a few countries included in the database, only water level time series are available and no discharges. Although these data are not directly comparable with discharge, the water level series were included in the database to supplement the temporal and spatial coverage of the data.
In Fig. 1, the current national coverage of the database and the data source labels of already existing international collections are depicted; detailed data source information is given in Table 1. For some countries such as Germany or Italy, where the hydrological data are not centrally collected and archived, several different regional data sources exist and are depicted with the same combined national label. It has to be noted that for most countries, only a selection of the available gauging stations has been shared by the data holders. Therefore, the spatial and temporal coverage within the database does not correspond to the entire national observational networks.

Flood Database characteristics
All series integrated into the database are checked for consistency of both the actual flood data and their metadata (Station Name, River/Stream, Station elevation, and Catchment Area). If problematic or incorrect information is detected, data providers are involved whenever possible to correct the data appropriately. Additionally, as the hydrological series stem from different sources, some stations have duplicated entries in the database (e.g. one from an international data source and one from a regional source). Extensive work is still being done to identify such duplicates and, if the data quality permits, to merge or amend the time series and the associated metadata.
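One way such duplicate entries could be flagged for manual review is sketched below. This is only an illustration under stated assumptions, not the procedure used for the database: the record fields (`id`, `source`, `name`, `river`, `area_km2`), the 5 % catchment-area tolerance and the example stations are all hypothetical.

```python
import unicodedata
from itertools import combinations


def normalise(text):
    """Lower-case, strip accents and drop non-alphanumeric characters."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def candidate_duplicates(stations, area_tolerance=0.05):
    """Return pairs of station ids from different sources that look alike.

    Two entries are flagged when station name and river match after
    normalisation and the catchment areas differ by at most
    `area_tolerance` (relative to the larger area). The tolerance is an
    assumed value; flagged pairs still need manual checking.
    """
    pairs = []
    for a, b in combinations(stations, 2):
        if a["source"] == b["source"]:
            continue  # only compare entries across data sources
        same_name = normalise(a["name"]) == normalise(b["name"])
        same_river = normalise(a["river"]) == normalise(b["river"])
        larger = max(a["area_km2"], b["area_km2"])
        similar_area = abs(a["area_km2"] - b["area_km2"]) <= area_tolerance * larger
        if same_name and same_river and similar_area:
            pairs.append((a["id"], b["id"]))
    return pairs


# Hypothetical example records, invented for illustration only.
stations = [
    {"id": "INT-001", "source": "international", "name": "Example-Gauge",
     "river": "Río Ejemplo", "area_km2": 1010.0},
    {"id": "REG-17", "source": "regional", "name": "example gauge",
     "river": "Rio Ejemplo", "area_km2": 1000.0},
    {"id": "REG-18", "source": "regional", "name": "Other Gauge",
     "river": "Rio Ejemplo", "area_km2": 250.0},
]
duplicates = candidate_duplicates(stations)
```

In this sketch only the first two entries are paired, since they agree in name and river after normalisation and their catchment areas differ by about 1 %.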
To date (January 2015), the preliminary database comprises over 7000 hydrometric stations with various series lengths, with more than 60 % of the stations spanning more than 40 years. The database is still expanding through new data series, either through digitisation of new series or through completing existing time series with data from different data sources.
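A summary statistic of this kind can be computed as in the following minimal sketch. It assumes that record length is counted as the number of distinct years with data (the span between first and last year would be an alternative definition), and the station identifiers and years are invented for illustration.

```python
def record_length_years(years):
    """Number of distinct years with data (gaps are not counted)."""
    return len(set(years))


def share_longer_than(records, threshold_years=40):
    """Fraction of stations whose record exceeds `threshold_years` years."""
    if not records:
        return 0.0
    n_long = sum(
        1 for years in records.values()
        if record_length_years(years) > threshold_years
    )
    return n_long / len(records)


# Hypothetical stations; keys and year ranges are illustrative only.
records = {
    "A": list(range(1950, 2011)),                              # 61 years
    "B": list(range(1990, 2011)),                              # 21 years
    "C": list(range(1960, 2005)),                              # 45 years
    "D": list(range(1970, 1990)) + list(range(1995, 2011)),    # 36 years, with a gap
}
share = share_longer_than(records, threshold_years=40)
```

Here two of the four hypothetical stations exceed 40 years of data, so `share` is 0.5.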

Spatial coverage
The aim of the database is to obtain a homogeneous flood database across Europe. In practical terms, however, depending on data availability and country size, countries and regions in Europe differ in the number of stations in the database (Fig. 2) and in their station density (see also Fig. 1).
If the spatial coverage of the stations is examined together with other station attributes such as catchment elevation (Fig. 3) and catchment area (Fig. 4), it becomes apparent that not only the spatial density across Europe is highly variable but also the other spatial characteristics. For example, high-elevation regions such as the Alps or the Pyrenees are well covered, whereas the Carpathians, the Kjolen and the Balkan Mountains show a lower density. Central Europe exhibits a better coverage of lower lying stations and of larger catchment areas than, for example, Eastern Europe. Although the station characteristics are diverse, the extended spatial coverage will allow for advancing the understanding of regional flood processes beyond individual country boundaries.
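The distinction between station count and station density can be illustrated with a short sketch. The region names, station counts and areas below are hypothetical placeholders, not values from the database, and expressing density per 1000 km² is an assumed convention.

```python
def station_density(station_counts, areas_km2, per_km2=1000.0):
    """Stations per `per_km2` square kilometres for each region.

    Regions without a known, non-zero area are skipped.
    """
    return {
        region: count * per_km2 / areas_km2[region]
        for region, count in station_counts.items()
        if areas_km2.get(region)
    }


# Hypothetical regions and numbers, for illustration only.
station_counts = {"Region X": 120, "Region Y": 30, "Region Z": 15}
areas_km2 = {"Region X": 80000.0, "Region Y": 300000.0}

density = station_density(station_counts, areas_km2)
```

In this example "Region X" has 1.5 stations per 1000 km² and "Region Y" 0.1, while "Region Z" is omitted because no area is given for it.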

Temporal coverage
The spatial coverage of the preliminary database can be considered satisfactory for investigating changes in floods and the associated flood generating processes. However, the temporal coverage of the database is less homogeneous for some regions. These inhomogeneities become particularly evident when analysing the annual maximum flow series, which are either the actual peak flow series or series derived from daily mean discharges (Fig. 5).
Several reasons for the differences in record length exist. The most important is historical: station records in different regions across Europe commence in different periods (Fig. 6). Second, the availability of the most recent or updated series also plays an important role (Fig. 7). In particular, data acquired from already existing international data sources (such as the EWA in Eastern Europe) exhibit short record lengths, owing to difficulties in obtaining updated national data beyond the end of the existing series. In addition, it is difficult to obtain updated time series for Italy after 1970, when the last national publication of annual maximum series was issued, and after institutional changes that split the national hydrological service into several regional services.
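How an annual maximum series may be derived from daily mean discharges is sketched below. The sketch assumes calendar years (a hydrological year is often used in practice) and a hypothetical completeness threshold `min_days`; neither choice is stated for the database itself. Note also that a maximum of daily means cannot exceed the instantaneous peak of the same year, so such derived series are not identical to observed peak flow series.

```python
from datetime import date


def annual_maxima(daily, min_days=350):
    """Annual maximum of daily mean discharge per calendar year.

    `daily` is an iterable of (date, discharge) pairs; missing values are
    given as None. Years with fewer than `min_days` valid values are
    dropped (the threshold is an assumption for illustration).
    """
    by_year = {}
    for day, discharge in daily:
        if discharge is None:
            continue  # skip missing observations
        by_year.setdefault(day.year, []).append(discharge)
    return {
        year: max(values)
        for year, values in sorted(by_year.items())
        if len(values) >= min_days
    }


# Tiny synthetic series (m3/s); a low threshold is used so it passes the filter.
daily = [
    (date(2000, 1, 1), 10.0),
    (date(2000, 6, 1), 55.5),
    (date(2000, 12, 31), 20.0),
    (date(2001, 3, 1), 30.0),
    (date(2001, 3, 2), None),
    (date(2001, 9, 9), 42.0),
    (date(2002, 1, 1), 99.0),
]
maxima = annual_maxima(daily, min_days=2)
```

With the synthetic data above, the year 2002 is dropped because it has only one valid value, which illustrates how incomplete years can be kept out of the derived series.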
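The two effects described above, differing start periods and differing end periods, can be summarised per decade as in the following minimal sketch. The station identifiers and year ranges are invented for illustration and do not represent actual records.

```python
from collections import Counter


def decade(year):
    """First year of the decade containing `year`, e.g. 1958 -> 1950."""
    return year // 10 * 10


def start_end_decades(records):
    """Count stations by the decade of their first and last year with data."""
    starts = Counter(decade(min(years)) for years in records.values() if years)
    ends = Counter(decade(max(years)) for years in records.values() if years)
    return dict(starts), dict(ends)


# Hypothetical stations; the year ranges are illustrative only.
records = {
    "A": list(range(1921, 1971)),  # 1921-1970
    "B": list(range(1955, 2011)),  # 1955-2010
    "C": list(range(1958, 1971)),  # 1958-1970
}
starts, ends = start_end_decades(records)
```

In this example two of the three hypothetical records end in 1970, the kind of pattern that shortens the usable common period even when the stations themselves are well distributed in space.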

Summary and outlook
The preliminary state of the European Flood Database (January 2015) has been presented. The database will be further developed and the associated research questions will be approached through further collaborations and research agreements based on the exchange of data, models, staff, and expertise. The current direction of further database development aims to increase the temporal coverage so that the satisfactory spatial coverage can be maintained across all time periods. This includes a focus on the last 40-50 years to allow for a better understanding of whether floods in Europe have been changing and if so when, where and how the changes have occurred.
A key element of the database that needs to be expanded over the course of database development is information on the degree of human modification of the river system related to each station and therefore to its time series. Currently, little metadata on this is in the database, but when data providers with detailed regional knowledge were involved in obtaining the time series, one of the requirements was that the time series show little human influence. This prerequisite could not be met for some regions because of the high degree of modification of the catchments. Ultimately, when analyses on the database are performed, local knowledge on rivers and the degree of human influence are important and will be considered through the contributions of the established network.
The database development is still ongoing; therefore, researchers and data providers interested in joining the European flood change research network are welcome. Generally, because of legal restrictions and economic concerns, various agreements with the data providers have to be signed before data can be incorporated into the database. Depending on these agreements, the use of the time series is either restricted to the FloodChange project or the data can be used for joint research within the network. Because of these legal restrictions, the database cannot be made freely available and is only accessible to partners involved in the research network. The final large-scale database will be used for joint research to improve our understanding of flood processes and associated changes in flood regimes at a European scale. Once European-wide patterns of change have been identified, the acquired understanding of the drivers will facilitate an improved understanding of possible future flood regime changes.