A probabilistic model for predicting seasonal rainfall in semi-arid lands of northeast Brazil

In most of the northeast region of Brazil rainfall is relatively low and shows significant inter-annual fluctuations, especially when compared to rainfall in other areas of Brazil. Moreover, evaporative rates in the northeast semi-arid region are very high, sometimes exceeding 2800 mm annually. Owing to such a climate, very large areas in northeast Brazil are subject to recurrent droughts. This paper presents a methodology for the prediction of seasonal rainfall in semi-arid lands of northeast Brazil. A total of 72 raingauge stations in Paraíba State and 84 in Ceará State were employed, distributed in three and seven homogeneous areas, respectively. A rainy season with different subdivisions was established for each homogeneous area. The z_i proportions (the ratio between the cumulative rainfall of the first rainy season period and the rain that falls during the whole rainy season) were fitted to the Beta probabilistic model, which was used for calculating the second and eighth deciles and the probability of rainfall above the average rainfall for the second period of the rainy season. The performance of the prognostic model for individual stations of Paraíba State in the period 1996–2000 was evaluated. In the period 1996 to 2000, with rainfall above average, the error was less than 20%. The methodology adopted proved very accurate for forecasting droughts in northeast Brazil.


INTRODUCTION
The weather and climate in northeast Brazil are highly influenced by phenomena on various scales, from the planetary scale to the small scale represented by isolated convection. The atmospheric circulations associated with anomalies in the sea surface temperature (SST), such as those that characterize the El Niño-Southern Oscillation (ENSO) phenomenon, the Atlantic dipole, the Atlantic subtropical anticyclones, the Intertropical Convergence Zone (ITCZ) and the 30-60 day oscillation, belong to the planetary scale. On the synoptic scale, the austral frontal systems or the convergence zones originating from them, the upper-tropospheric cyclonic vortices and the easterly disturbances are of considerable relevance. Mesoscale phenomena also occur in northeast Brazil: the systems coming from the ITCZ (mesoscale convective complexes and instability lines that build up along the northern coast of northeast Brazil) and the circulations caused by thermal contrast between land and water surfaces. In most parts of the region, rainfall is relatively low and shows significant inter-annual fluctuations, especially when compared to rainfall in other areas of Brazil. Moreover, evaporative rates in the northeast semi-arid region are very high, sometimes exceeding 2800 mm annually. Owing to such a climate, very large areas in northeast Brazil are subject to recurrent droughts.
In spite of the great advances of the last 10 years, which have made the dynamic and statistical models far more efficient, considerable divergence still persists among climate forecasters' predictions. This demonstrates the high degree of complexity that climate forecasting entails and shows that meteorology still has a long way to go before it satisfies the scientific community as regards the accuracy and lead time of forecasts. Until the required accuracy and lead time are reached, the existing models should be applied so as to support the planning and decision-making procedures aimed at minimizing the impact of undesirable climatic conditions on the region, mainly those affecting farming and water resources. Silva (1985) suggested a new probabilistic technique that differs considerably from those developed so far (Ward and Folland 1991). This technique has been successfully employed in a variety of pluvial regimes in northeast Brazil. The same model was employed by Azevedo et al. (1998) in the State of Ceará to predict rainfall for the second half of the rainy season (from 20 March to 30 June) for each homogeneous area of the state. The methodology proposed by Silva proved rather effective in assessing the minimum and maximum rainfall values for the second half of the rainy season in various homogeneous micro-regions of Ceará, mainly the minimum values for the 1960-1969 period. Santos et al. (2002), employing the Silva (1985) method, used monthly totals from 34 raingauge stations on the eastern coast of northeast Brazil, which extends from Rio Grande do Norte State down to Sergipe State. The methodology proved rather effective at predicting the maximum and minimum rainfall values for the second period of each rainy season for both the northern and central regions, whereas for the southern region it proved far less effective.
The present paper aims at producing a methodology based on the Silva (1985) climatic model for predicting rainfall in semi-arid lands of northeast Brazil. As a specific aim, a technique is presented to determine the probability of occurrence of rainfall higher than the climatic value for the second period of the rainy season (RS).

DATA AND METHODS
The study region
The region selected to illustrate the study (Fig. 1) comprises the states of Paraíba and Ceará, on account of the existing facilities in these states: the Laboratório de Meteorologia, Recursos Hídricos e Sensoriamento Remoto (LMRS) in Campina Grande city, and FUNCEME (Fundação Cearense de Meteorologia), located in Fortaleza city. The LMRS and FUNCEME conduct weather and climate monitoring for Paraíba and Ceará states, respectively, and make use of the findings of Silva (1985). Moreover, Paraíba has a number of different pluvial regimes, which makes it easier to evaluate the effectiveness of the technique.
The data used
The data on which the present survey was based were collected from 72 raingauge stations in Paraíba and 84 in Ceará during the period 1910-1990. The data are available in digital archives at the Academic Unit of Atmospheric Science, Federal University of Campina Grande. The data used for validation of the model, that is, from 1996 to 2001, were provided by the LMRS. The stations consulted, with their geographic coordinates, and the study mesoregions are found in Silva et al. (2004) and Azevedo et al. (1998).
Definition of study mesoregions
Three homogeneous areas for the State of Paraíba, here defined as Western (Sertões), Central (Cariri) and Eastern (Zona da Mata and Litoral), and seven for the State of Ceará, have been considered for the present work. These homogeneous areas were identified and have been used by the LMRS for a number of purposes, including weather and climate forecast.
In the Silva (1985) model, as described in the next section, the rainy season (RS) and its two periods must be identified for every locality. Based on the "Atlas Climatológico do Estado Paraiba" (Varejão-Silva et al. 1987), the RS for the Western, Central and Eastern homogeneous areas covers the periods from January to June, January to July and January to August, respectively. For each mesoregion, initial periods of the rainy season with differing durations were taken into account. This allows a forecast to be issued on two or more distinct occasions during the RS. The two periods used for each RS in the Sertões mesoregion are given in Table 1. In order to show another way of exploring the technique, for the State of Ceará only the RS from January to June was considered, comprising all the stations of the same homogeneous area.
The Silva Model
The Silva (1985) model considers the proportion of the rainy season (RS) rainfall that falls during the first period (X) of the season. If the rainfall of the remaining RS period is termed Y, a proportion Z = X/(X+Y) is obtained for each year of the study period. As a result, the Z values vary from 0 (zero) to 1 (one). For the Paraíba hinterland, Silva (1985) took X on different occasions as the rainfall of January and February (JF), of January, February and the first 19 days of March (JF19M), or of January to March (JFM), with the RS encompassing the period from January to June (JFMAMJ). By means of this model it is possible to predict, at different probability levels, the maximum (and minimum) rainfall for the second period (Y) of the rainy season. An 80% probability level has normally been used (Azevedo et al. 1998, Santos et al. 2002), but other levels may be considered. Once the RS and its initial period are identified, the Z proportions are computed year by year from a data set covering not less than 30 years. The Beta probabilistic model is then fitted to the set of z_i values, because this model is applicable to proportions, and its fit to the z_i set is checked with the Kolmogorov-Smirnov test. Once it is confirmed that the Beta model represents the z_i values adequately, one proceeds to estimate the second and eighth deciles, which are then used to establish the forecast.
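The first step of the procedure, building the yearly z_i proportions, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `z_proportions` and the rainfall totals are hypothetical, and years with no rain in either period are simply skipped, which is an assumption made here.

```python
def z_proportions(first_period, second_period):
    """Return z_i = X_i / (X_i + Y_i) for each year.

    first_period, second_period: yearly rainfall totals (mm) for the
    first (X) and second (Y) periods of the rainy season.
    Years with zero total rainfall are skipped (assumption of this sketch).
    """
    z = []
    for x, y in zip(first_period, second_period):
        total = x + y
        if total > 0:
            z.append(x / total)
    return z


# Hypothetical totals (mm) for four years; not data from the paper.
x_mm = [120.0, 300.0, 80.0, 210.0]
y_mm = [480.0, 300.0, 320.0, 490.0]
z = z_proportions(x_mm, y_mm)
print(z)
```

In practice the series would cover at least 30 years, as the text requires, before the Beta model is fitted to it.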
If Q_2 is the second decile of the z_i proportions, there is an 80% probability that a proportion is equal to or greater than Q_2. Thus, for a given year selected at random,

$$P\left(\frac{X_i}{X_i + Y_i} \ge Q_2\right) = 0.80, \qquad (1)$$

according to which z_i values larger than Q_2 are likely to occur. Likewise, for the eighth decile (Q_8) of the proportions there is an 80% probability of values less than or equal to Q_8, so that for a given year selected at random,

$$P\left(\frac{X_i}{X_i + Y_i} \le Q_8\right) = 0.80, \qquad (2)$$

where X_i is the rainfall for the first RS period of that year. Developing equations (1) and (2) further, that is, solving each inequality for Y_i, one obtains the maximum and minimum rainfall heights expected at the 80% probability level:

$$Y_{max} = X_i \, \frac{1 - Q_2}{Q_2}, \qquad Y_{min} = X_i \, \frac{1 - Q_8}{Q_8}.$$
It may be noticed that z_i relates rainfall belonging to two different time periods. Once the statistical properties of z_i and the rainfall X_i of the first period are known, the forecast for the second-period rainfall Y is made, and that constitutes the core of the Silva (1985) model.
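Solving z_i ≥ Q_2 and z_i ≤ Q_8 for the second-period rainfall gives the bounds Y_max = X_i (1 − Q_2)/Q_2 and Y_min = X_i (1 − Q_8)/Q_8. The sketch below evaluates these two expressions; the function name and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def second_period_bounds(x_i, q2, q8):
    """Return (y_min, y_max) for the second-period rainfall.

    x_i: rainfall observed in the first period of the rainy season (mm).
    q2, q8: second and eighth deciles of the z proportions (0 < q2 <= q8 < 1).
    Each bound holds on its own with 80% probability.
    """
    if not 0.0 < q2 <= q8 < 1.0:
        raise ValueError("deciles must satisfy 0 < q2 <= q8 < 1")
    y_max = x_i * (1.0 - q2) / q2  # from z >= q2
    y_min = x_i * (1.0 - q8) / q8  # from z <= q8
    return y_min, y_max


# Hypothetical values: 200 mm in the first period, Q2 = 0.25, Q8 = 0.50.
y_min, y_max = second_period_bounds(200.0, 0.25, 0.50)
print(y_min, y_max)  # 200.0 600.0
```

Note that each bound carries its own 80% level; the probability that Y falls between Y_min and Y_max simultaneously is 60%, since 20% of the z_i values lie below Q_2 and 20% above Q_8.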

Beta probability density function
A continuous random variable Z, taking values between zero and one, follows the Beta probabilistic model if its probability density function is given by (Wilks 1995):

$$f(z) = \frac{1}{B(a,b)} \, z^{a-1} (1-z)^{b-1}, \quad 0 < z < 1, \qquad B(a,b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)},$$

where a and b are the parameters of the model, B(a,b) is the Beta function and Γ is the Gamma function.
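As an illustration of the density, the following sketch evaluates the standard Beta probability density function using only the Python standard library. The function name `beta_pdf` is chosen here for illustration; the log-Gamma function is used to keep the computation stable for large parameters.

```python
import math


def beta_pdf(z, a, b):
    """Beta density f(z) = z**(a-1) * (1-z)**(b-1) / B(a, b) for 0 < z < 1."""
    if not 0.0 < z < 1.0:
        return 0.0
    # log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_density = (a - 1.0) * math.log(z) + (b - 1.0) * math.log(1.0 - z) - log_beta
    return math.exp(log_density)


# For a = b = 2 the density is 6 z (1 - z), so f(0.5) = 1.5.
print(beta_pdf(0.5, 2.0, 2.0))
```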
Beta model parameter estimation and the goodness-of-fit test
The method of maximum likelihood was used to estimate the parameters a and b of the Beta probabilistic model, following the solutions recommended by Mielke (1976). The fit of the Beta probability distribution was assessed with the Kolmogorov-Smirnov (K-S) test, a nonparametric test that can also be applied to small samples.
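The K-S statistic itself is simple to compute once a fitted cumulative distribution function is available. The sketch below is a generic illustration, not the procedure of Mielke (1976) or of the authors: the function names are hypothetical, the CDF is passed in as a callable, and the critical value uses the common large-sample approximation 1.22/√n for α = 0.10, which is an assumption that becomes less accurate for small n.

```python
import math


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of `sample` and `cdf`."""
    data = sorted(sample)
    n = len(data)
    d = 0.0
    for i, value in enumerate(data, start=1):
        f = cdf(value)
        # The empirical CDF jumps from (i-1)/n to i/n at this point.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_value_10pct(n):
    """Large-sample approximation of the critical value at alpha = 0.10."""
    return 1.22 / math.sqrt(n)


# Hypothetical sample tested against the uniform CDF F(z) = z,
# which is the Beta model with a = b = 1.
d = ks_statistic([0.1, 0.4, 0.7], lambda z: z)
print(d)  # approximately 0.3
```

The fit is not rejected when the statistic is below the critical value. When the parameters are estimated from the same sample, the standard critical values are only approximate and tend to be conservative.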
Probability of rainfall above the average value of Y
In order to calculate the probability of rainfall above the average of the RS second period in a given locality and year, it is enough to determine the probability of Z being equal to or less than Z_ave, where Z_ave = X_i / (X_i + Y_ave), X_i is the rainfall recorded during the RS first period of that particular year, and Y_ave is the average rainfall of the RS second period. This probability can be obtained from the Beta model.
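The calculation P(Y > Y_ave) = P(Z ≤ Z_ave) can be sketched as below. This is a minimal standard-library illustration under stated assumptions: the Beta CDF is approximated by midpoint-rule integration of the density (adequate for a, b ≥ 1; a library routine for the regularized incomplete Beta function would normally be preferred), and all names and numbers are hypothetical.

```python
import math


def beta_pdf(z, a, b):
    """Beta density on (0, 1), computed through log-Gamma for stability."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1.0) * math.log(z) + (b - 1.0) * math.log(1.0 - z) - log_beta)


def beta_cdf(z, a, b, steps=10_000):
    """Approximate P(Z <= z) by midpoint-rule integration of the density."""
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    h = z / steps
    return h * sum(beta_pdf((k + 0.5) * h, a, b) for k in range(steps))


def prob_above_average(x_i, y_ave, a, b):
    """P(Y > Y_ave) given first-period rainfall x_i, via P(Z <= Z_ave)."""
    z_ave = x_i / (x_i + y_ave)
    return beta_cdf(z_ave, a, b)


# Hypothetical case: 100 mm observed, 300 mm second-period average, a = b = 2.
print(round(prob_above_average(100.0, 300.0, 2.0, 2.0), 5))  # 0.15625
```

For a = b = 2 the CDF is 3z² − 2z³, so Z_ave = 0.25 gives 0.15625, which the numerical result reproduces. A wetter first period raises Z_ave and therefore the estimated probability of an above-average second period.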

RESULTS AND DISCUSSION
Adjusting the Beta Model
For the Western homogeneous area (Sertões), 37 stations and three different periods were used to establish the expected rainfall forecast for the RS second period, with second periods starting at the beginning of March, the middle of March and the beginning of April. This amounts to fitting the Beta distribution to 111 samples (37 stations × 3 periods). The Kolmogorov-Smirnov test was applied to all of them at a significance level of α = 0.10. The Beta model fitted all samples under investigation satisfactorily, which indicates that it is suitable for these data. When the Beta model was applied to the seven mesoregions of Ceará State, good fits were also verified. Santos et al. (2002) applied the Beta model to z_i proportions from 34 stations located on the eastern coast of northeast Brazil for different rainy seasons. The best Beta fits were obtained for the data of Natal-RN, Itabaiana-PB, Mamaguape-PB and Palmares-PE. The poorest fits occurred at Propriá-SE and Aracaju-SE, although even there the null hypothesis (H0) was not rejected at α = 0.10.

The Silva Model Performance
The performance of the Silva model was assessed by counting the successes and failures of the Y_max and Y_min predictions for the second RS period (Y) in 2000, which serves only to illustrate the potential of the method. Table 2 shows the results of applying the forecasting model to the year 2000 and to the rainy season RS1. The same table shows the second decile (Q_2) and eighth decile (Q_8) of the z_i proportions, the average rainfall for the RS first (X_m) and second (Y_m) periods, the rainfall values for the first (X_2000) and second (Y_2000) periods of the 2000 RS, the maximum (P_max) and minimum (P_min) rainfall values predicted by the Silva model for the second part of the 2000 rainy season, and the z_i proportion values for the year 2000. Table 2 also shows that the forecast of Y_min failed in 10 cases (bold numbers), which represents 23%, when the expected failure rate would be 20%. In the case of Y_max, the forecast was 100% successful.
All the results predicted for the whole period 1996-2001 in the Sertões of the State of Paraíba for RS1 are presented in Table 3. The first period of that RS is formed by January and February (JF), and the second one by March to June (MAMJ). On examining the results, one finds 89.19% success in predicting maximum values and 97.29% success in predicting minimum values for the year 1996. The results obtained for the Y_max and Y_min predictions were rather satisfactory, lying within the expected range (86.08% for Y_max and 78.97% for Y_min). In the State of Ceará, for the central area (mesoregion E6), according to Table 4, both Y_max and Y_min presented just one failure in the period 1960-1969, which shows the importance of the model. Santos et al. (2002) obtained similar results after investigating raingauge stations on the eastern coast of northeast Brazil, and so did Azevedo et al. (1998) when they studied the homogeneous microregions of the State of Ceará. For all rainy seasons the forecasts were highly accurate, mainly for Y_max. Considering all investigated years, 1998 and 2001 were the ones that presented the largest differences between predictions involving all rainy seasons. One can see that for all rainy seasons the X_1998 values produced low z_i values and consequently reduced P(Y>Y_clim). This means that 1998 may be considered a non-typical year, given the rise in discrepancies between the probabilities P(Y>Y_clim) and the occurrence of rainfall above the climatic average Y_clim at every station and RS. It is worth noting that 1998 was the year with the strongest El Niño of the last century. As to the predicted maximum and minimum rainfall, the success rates were very good, which recommends the use of the Silva model in the LMRS and FUNCEME operational routine. Moreover, there is strong evidence that applying a filter in the case of extreme events (rainy days) would enhance the Silva model performance.

Fig. 1
Study region with the raingauge stations and water basins of Paraíba State.

Table 1
Rainy season (RS) and its first and second periods for the Sertões (Western) mesoregion and Ceará State.

Table 2
Raingauge stations of Sertões with their corresponding data for RS1: second (Q_2) and eighth (Q_8) deciles of z_i; mean values of the first (X_m) and second (Y_m) periods of RS1; forecast maximum (P_max) and minimum (P_min) rainfall values for the year 2000; and the proportion z_i for the year 2000.

Table 3
General results (%) of forecasting success in the period 1996 to 2001 for RS1, RS2 and RS3.

Table 4
Results for X_i, Y_i, Y_max and Y_min associated with the raingauges of EC6 in the State of Ceará.