Evaluation of hydrological parameters of the Goiana River basin in the State of Pernambuco using the automatic calibration tool of the hydrodynamic model PCSWMM in multiple fluviometric stations

The objective of this study was to perform a sensitivity analysis and automatic calibration of the hydrological parameters of the Goiana River basin in the state of Pernambuco. A hydrological modeling structure for the basin was built using the PCSWMM (Personal Computer Storm Water Management Model) hydrodynamic model, and the analyses were carried out in 10 sub-basins. Based on the model developed for the basin, the PCSWMM resources for sensitivity analysis and automatic calibration, known as SRTC (Sensitivity-based Radio Tuning Calibration), were used first to identify the most sensitive parameters and then to calibrate the basin parameters. Daily flow data were extracted from 5 fluviometric stations on the main watercourses of the basin, and precipitation data from 10 pluviometric stations. The simulation results were evaluated with the following statistics: Integral Square Error (ISE), Nash-Sutcliffe Efficiency (NSE) and coefficient of determination (R²), in addition to a qualitative analysis of the modeled hydrographs against the observations. The most sensitive parameters were the Curve Number (CN) and the Manning roughness of permeable areas. At three control points (Engenho Retiro, Caricé and Nazaré), the NSE values were 0.919, 0.899 and 0.884, respectively. In general, the results indicate a very satisfactory performance of the model's automatic calibration and sensitivity analysis tools, which were also effective compared with traditional trial-and-error calibration.


Introduction
Conceptual runoff models, whether for urban or rural areas, require a large number of variables and parameters to adequately describe the complex relationships between rainfall, runoff and river basin characteristics. The difficulty of estimating these parameters often limits the use of such models. The performance of the models in predicting the hydrograph is therefore highly dependent on the accuracy of their calibration (Loucks et al., 2005).
Due to the presence of several optimal solutions and the large number of parameters considered, the trial-and-error calibration approach often used is tedious and time-consuming. It is therefore necessary to develop robust and reliable automatic calibration procedures to obtain the ideal parameters for the model (Javaheri, 1998).
Traditional calibration methods aim to produce model results that agree with the measured outputs; the agreement is reached through an appropriate choice of parameters. To improve the accuracy of the calibration and the reliability of the models, as well as to streamline the calibration process, an automatic calibration method becomes desirable (James, 2005).
Automatic calibration algorithms coupled with a hydrodynamic flow model have been increasingly used by researchers to improve the accuracy and efficiency of parameter adjustment. Formiga et al. (2016) modified the structure of the SWMM model to allow the coupling of the R-NSGA algorithm (Evolutionary Reference Point Based Non-Dominated Sorting Genetic Algorithm) for automatic calibration of the model parameters.
James, Wan and James (2002) implemented in PCSWMM a genetic algorithm for automatic calibration and design optimization. Gou et al. (2020) used an automatic calibration structure that combines a sensitivity analysis (SA) with an optimization algorithm to calibrate a hydrological model. Wang et al. (2020) developed a routine for automatic parameter calibration of the SWAT model based on a genetic algorithm (GA) and particle swarm optimization (PSO).
Research, Society and Development, v. 12, n. 2, e15011225331, 2022 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v11i12.25331
Typically, automatic calibration methods based on a sensitivity analysis involve four main steps: (1) sensitivity analysis; (2) determination of the limits or ranges of the parameters (uncertainty analysis); (3) calibration; and (4) performance evaluation of the calibrated model (James et al., 2002). The objective of the sensitivity analysis is to prepare the calibration of the basin parameter sets by illustrating the impact of the uncertainties of the input parameters on the model responses (Garambois et al., 2015). The intention is therefore to verify the rate of change in the response of a model in relation to changes in its input parameters (James, 2005). After the sensitivity analysis, it is necessary to determine the acceptable ranges of the parameters that will be adjusted in the model, a step also called uncertainty analysis. In this step, an attempt is made to describe the entire set of possible results of the model parameters, together with their associated probabilities of occurrence, or simply to assign physically acceptable minimum and maximum limits (Loucks; Beek, 2017).
The performance analysis of calibrated models involves a qualitative evaluation, through visual comparison of the simulated and observed time series, to identify how well the model captures the timing, trends and magnitudes of the observations. It also involves quantitative assessments, which measure the level of agreement between simulations and observations using goodness-of-fit statistics that serve as criteria for accepting or rejecting the calibration results (Ahmadisharaf et al., 2019).
Complementing the analysis of the model's performance evaluations, Daggupati et al. (2015) argued that the use of quantitative statistics alone can be misleading and unreliable. Graphical comparisons together with quantitative statistics result in a better assessment criterion.
The present study uses the automatic calibration resources of the PCSWMM hydrodynamic model to estimate the hydrological parameters of the Goiana River basin in the state of Pernambuco, using five flow observation points distributed along three of the main rivers in the basin. The performance of the model response is analyzed by means of goodness-of-fit statistics, namely the Nash-Sutcliffe Efficiency (NSE), the coefficient of determination (R²) and the Integral Square Error (ISE), in addition to qualitative evaluations of the simulated hydrographs against the observations, considering maximum flows and trends.

Materials and Methods
The methodological strategy adopted to achieve the objectives of the study involves the following steps: (1) construction of the hydrological model in PCSWMM; (2) sensitivity and uncertainty analysis of the model parameters; (3) automatic calibration using multiple fluviometric stations and sub-basins; (4) quantitative and qualitative performance analysis of the model. Figure 1 shows the flowchart of the research steps. Source: Authors (2021).

Study area
The study area corresponds to the hydrographic basin of the Goiana River, located in the northeastern portion of the State of Pernambuco, between the coordinates 07º22'20" and 07º54'47" south latitude, and 34º49'06" and 35º41'43" west longitude. The lower part of the basin has a hot and humid climate, with average rainfall above 1,000 mm per year, reaching more than 2,000 mm in coastal areas. The rainy season lasts six months, from March to August (CONDEPE/FIDEM, 2005).

Pluviometry and Fluviometry of the region
In this study, the Goiana River basin was subdivided into 10 sub-basins. Nine of them were delimited using the following criteria: sub-basins with outlets at the fluviometric stations (control points) and sub-basins with outlets at the basin's reservoirs. The last sub-basin, at the mouth, corresponds to the Goiana River itself, which does not have a fluviometric station.
In total, the basin has 5 fluviometric stations, all operated by ANA (National Water Agency), and 4 reservoirs, as shown in Tables 1 and 2. For the purposes of this study, the sub-basins were named after the fluviometric station or reservoir at their outlet (Figure 3). For calibration, daily flow data were extracted for the period from 1 to 30 June 2013 and, for validation, for the period from 1 to 30 September of the following year, 2014. These periods were chosen because the data are consistent and free of gaps, and because they include intense rainfall events in the region.
Daily precipitation data were obtained from 10 pluviometric stations (Table 3), all operated by ANA, with 2 stations within the limits of the Engenho Volta, Engenho Retiro and Itapissirica sub-basins, 3 stations within the Nazaré sub-basin and 1 station in the Caricé sub-basin.
PCSWMM associates only one precipitation series with each sub-basin. Therefore, interpolation weighted by the inverse of the squared distance was used to obtain the rainfall in each sub-basin. This method considers that the rain at a given location (here, the centroid of each sub-basin) can be calculated as a weighted average of the rain recorded at the rain gauges of the region. The weighting is such that the nearest pluviometric stations have a greater weight in the average, according to equation 1 (Collischonn; Dornelles, 2013).
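The inverse squared distance weighting described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the planar coordinates and the rainfall values are hypothetical, and distances are assumed to be Euclidean.

```python
import math

def idw_rainfall(centroid, gauges, power=2.0):
    """Inverse-distance-weighted rainfall at a sub-basin centroid.

    centroid: (x, y) coordinates of the sub-basin centroid.
    gauges:   list of ((x, y), rainfall_mm) pairs, one per rain gauge.
    power:    exponent applied to the distance (2 = inverse squared distance).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (gx, gy), rain in gauges:
        dist = math.hypot(gx - centroid[0], gy - centroid[1])
        if dist == 0.0:
            # A gauge located exactly at the centroid determines the value.
            return rain
        weight = 1.0 / dist ** power
        weighted_sum += weight * rain
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical gauge coordinates (km) and daily rainfall (mm).
gauges = [((0.0, 10.0), 12.0), ((20.0, 0.0), 30.0), ((0.0, -5.0), 6.0)]
estimate = idw_rainfall((0.0, 0.0), gauges)
```

In this example the gauge 5 km away receives sixteen times the weight of the gauge 20 km away, so the estimate stays close to the nearest readings.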

Hydrodynamic model PCSWMM
The rainwater management model PCSWMM (Personal Computer Storm Water Management Model), based on SWMM (version 5) of the USEPA (United States Environmental Protection Agency) and integrated with a geographic information system (GIS), is the model used in the present study. In this model, each sub-basin is represented as a rectangular surface with a uniform slope and a flow width that directs the runoff to a single point in the outlet channel. Runoff is generated by modeling the sub-basin as a non-linear reservoir (Rossman, 2015).
The main input data for the model are the sub-basin area (ha), average slope (%), runoff width, percentage of impermeable area (%), Manning roughness of impermeable and permeable areas, and depression storage depth of impermeable and permeable areas. The Curve Number (CN) method was chosen as the infiltration model.
The SWMM model is traditionally used to model small and medium-sized urban basins, usually as a tool for managing and detecting flooding points. The addition of new features in PCSWMM, such as a GIS environment and tools for sensitivity analysis and automatic calibration, has motivated its use in other types of water systems. In this study, the performance of the model is tested in a large, predominantly rural basin.
Munir, Ahmad and Hafeez (2019) demonstrated the good performance of the PCSWMM model in large rural basins: using the model's automatic calibration tools, they found Nash-Sutcliffe efficiency (NSE) values between 0.75 and 0.97 and coefficients of determination between 0.94 and 0.98. Ghofrani, Sposito and Faggian (2019) also used the PCSWMM model in a predominantly rural basin.

Sensitivity analysis and calibration
PCSWMM uses a tool known as SRTC (Sensitivity-based Radio Tuning Calibration) to calibrate hydrological models against observed data or to test the sensitivity of attributes and parameters. The SRTC tool requires uncertainty estimates for the attributes and parameters to be calibrated. SWMM (version 5) is then executed using the limit values of the sensitivity range specified by the user. PCSWMM also gives the user the option of considering intermediate values within the limits specified for each parameter in the sensitivity graphs, making it possible to check whether the response is linear in that parameter (Chi Water, 2021).
Therefore, the first step in calibrating the model was to select the most sensitive parameters and estimate a range of uncertainty for them. This uncertainty indicates the limits in which the user feels comfortable with the adjustment of the parameter, being expressed as a percentage deviation (more or less) from its current value (best estimate).
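The idea of running the model at the limits of each parameter's uncertainty range can be illustrated with a generic one-at-a-time perturbation. This sketch does not reproduce the internals of SRTC: the symmetric limits (current value plus or minus the percentage) are an assumption made here for simplicity, and `toy_peak_flow` is a hypothetical stand-in for a SWMM run, not a hydrological model.

```python
def one_at_a_time_sensitivity(model, params, uncertainty_pct):
    """Percent change in the model response when each parameter is moved,
    one at a time, to the lower and upper end of its uncertainty range."""
    base = model(params)
    result = {}
    for name, value in params.items():
        frac = uncertainty_pct[name] / 100.0
        # Assumed symmetric limits around the current best estimate.
        low = dict(params, **{name: value * (1.0 - frac)})
        high = dict(params, **{name: value * (1.0 + frac)})
        result[name] = (
            (model(low) - base) / base * 100.0,
            (model(high) - base) / base * 100.0,
        )
    return result

def toy_peak_flow(p):
    # Hypothetical response: peak flow grows with CN and width and
    # decreases with the roughness of permeable areas.
    return p["cn"] ** 2 * p["width"] ** 0.5 / p["n_perv"] ** 0.6

params = {"cn": 70.0, "width": 500.0, "n_perv": 0.03}
uncertainty = {"cn": 20.0, "width": 60.0, "n_perv": 50.0}
sens = one_at_a_time_sensitivity(toy_peak_flow, params, uncertainty)

# Rank parameters by the largest absolute change they produce.
ranking = sorted(sens, key=lambda k: max(abs(v) for v in sens[k]), reverse=True)
```

Comparing the response at the lower and upper limits also hints at non-linearity: in the toy response, raising CN by 20% changes the peak by about +44%, while lowering it by 20% changes it by about -36%.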
The lower and upper parameter values for a specified uncertainty are calculated by PCSWMM according to equations 2 and 3 (Chi Water, 2021). These equations give, respectively, the parameter value at the lower limit and at the upper limit of the uncertainty range, as functions of the current value of the estimated parameter and of the uncertainty considered (%).
The strategy used to calibrate the model was to analyze jointly the sub-basins that drain to the same river. For example, the Engenho Volta, Caricé, Guararema and Tiúma sub-basins drain into the Capibaribe-Mirim River; therefore, the parameters of these sub-basins were calibrated together, considering the two control points on this river. The same procedure was used for the other rivers: the Siriji River, with the Engenho Retiro and Siriji sub-basins, and the Tracunhaém River, with the Nazaré, Palmeirinha and Itapissirica sub-basins.
It is important that the calibration methods do not allow the parameters to exceed physically meaningful maximum and minimum limits. Carefully determining the limits of each calibration parameter increases the accuracy and efficiency of the process (James, 2005). The first estimates of the model parameters should be as close as possible to the calibrated values. Thus, a realistic, physically meaningful range must be identified for each model parameter.
The parameters analyzed in this study were:
• Area and slope of the sub-basins: These parameters were obtained with GIS technology, using a Digital Terrain Model (DTM) with a spatial resolution of 30 meters. They are therefore subject to errors, even if minimal.
James (2005) suggests a limit of ±2% of the measured value for the area and of -12% to +29% of the measured value for the slope. In this study, ±5% for the area and ±30% for the slope were used, which are selectable values within the model's resources.
• Runoff width: To estimate the width of the sub-basin systematically, the measured area of the sub-basin is divided by the length of the longest flow path. An uncertainty range of ±60% was assigned to this parameter, given the possible errors both in the area measurement and in the length of the longest watercourse.
• Percentage of impermeable area (%): An initial value of 10% was assigned to all sub-basins, with an uncertainty range of ±100%. All sub-basins are predominantly rural, with urban areas covering no more than 20% of each sub-basin.
• Manning roughness coefficient of impermeable and permeable areas: The initial values of this parameter were taken from the literature recommended by PCSWMM. Owing to the high variability of the Manning roughness coefficient across types of land cover, the limits and range of this parameter are very difficult to estimate. Therefore, an uncertainty of ±100% was adopted in relation to the initial values, 0.030 for permeable areas and 0.015 for impermeable areas (Rossman, 2015).
• Depression storage depth of impermeable and permeable areas: This is the portion of water that is stored in depressions and depleted only by evaporation. Its value varies with the type of soil and the slope of the basin. The PCSWMM user manual recommends a range of 0.13 mm to 1.50 mm for impermeable areas and of 1.5 mm to 6.5 mm for permeable areas (Rossman, 2015). An uncertainty range of ±100% was also assigned, and the initial value was the median of the ranges presented above.
• Curve Number (CN): The initial value of the CN was assigned based on official data from the National Water Agency (ANA), available on its "Metadata" portal. The uncertainty range used was ±100%. For this parameter, the physically acceptable limits are 0 to 100, and the model itself restricts its value to this range.
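Two of the rules in the list above lend themselves to a short sketch: the width estimate (area divided by the longest flow path) and the restriction of CN to the range 0 to 100. The function names and the example sub-basin are hypothetical, and the conversion assumes the area is given in hectares and the flow path in meters.

```python
def runoff_width_m(area_ha, longest_flow_path_m):
    """Sub-basin width (m): area divided by the longest flow path."""
    area_m2 = area_ha * 10_000.0  # 1 ha = 10,000 m2
    return area_m2 / longest_flow_path_m

def clamp_curve_number(cn):
    """Keep CN within its physically acceptable range of 0 to 100."""
    return max(0.0, min(100.0, cn))

# Hypothetical sub-basin: 25,000 ha with a longest flow path of 40 km.
width = runoff_width_m(25_000.0, 40_000.0)  # 6250.0 m

# A CN of 70 raised by 100% would exceed the physical limit and is capped.
capped = clamp_curve_number(70.0 * 2.0)  # 100.0
```

Because the width is a ratio of two measured quantities, errors in both the area and the flow path length propagate into it, which is consistent with the wide uncertainty range assigned to this parameter.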
The observed time series files can be opened in the SRTC tool (after running the sensitivity analysis) or in the Graph panel (before starting the SRTC tool). In the latter case, periods or ranges of values can be selected for the calibration, thus reducing computational time (Chi Water, 2021).
In the SRTC calibration window, the radio tuning sliders (Figure 4) are displayed for each parameter, where it is possible to view the calculated effect of the adjustment automatically. The "Optimize" button performs the automatic calibration of each parameter, which is optimized based on the ISE (integral square error) or NSE (Nash-Sutcliffe efficiency) adjustment measures.

Statistical error analysis
The SRTC calibration tool calculates, in real time, various error statistics between time series with the same units.
The statistics include the integral square error (ISE) and its rating, Nash-Sutcliffe efficiency (NSE), coefficient of determination (R²), standard error of estimate (SEE), simple least squares (LSE), dimensionless simple least squares (LSE dim), root mean square error (RMSE) and dimensionless root mean square error (dim RMSE).
In this study we will discuss and analyze the quantitative adjustments of the Integral Square Error (ISE), the Nash-Sutcliffe efficiency (NSE) and the determination coefficient (R²), in addition to the qualitative adjustment of the hydrographs observed and calculated by the model.

Integral Square Error (ISE)
The Integral Square Error (ISE) is a measure of the goodness of fit between the observed and modeled responses. An ISE below 3.0 is rated "Great".

Nash-Sutcliffe efficiency (NSE)
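Nash-Sutcliffe efficiency (NSE) is a normalized statistic that determines the relative magnitude of the residual variance ("noise") compared to the variance of the measured data ("signal") (Nash; Sutcliffe, 1970).
The equation is as follows:
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{obs,i} - Q_{sim,i}\right)^2}{\sum_{i=1}^{n} \left(Q_{obs,i} - \overline{Q}_{obs}\right)^2}$$
where $Q_{obs,i}$ is the observed flow at time step $i$, $Q_{sim,i}$ is the flow calculated by the model at the same time step, $n$ is the number of observations and $\overline{Q}_{obs}$ is the average of the observed data.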
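As a worked illustration of the definition above (one minus the ratio of the residual variance to the variance of the observations), the NSE can be computed as in the following sketch. The function name and the flow values are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency between two equally long flow series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual variance ("noise"): squared model errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Variance of the measured data ("signal").
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

# Hypothetical daily flows (m3/s).
obs = [2.0, 4.0, 6.0, 8.0]
sim = [3.0, 4.0, 5.0, 8.0]
nse = nash_sutcliffe(obs, sim)  # residual = 2, variance = 20, NSE = 0.9
```

An NSE of 1 indicates a perfect fit, an NSE of 0 means the model is no better than predicting the mean of the observations, and negative values mean the mean of the observations is the better predictor.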

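Determination coefficient (R²)
The coefficient of determination R² is a summary measure that tells how well the sample regression line fits the data (Gujarati; Porter, 2009). It can be interpreted as the proportion of variance in the dependent variable that is predictable from the independent variable. An R² equal to 1 indicates that the regression line fits the data perfectly. Its equation is presented below.
$$R^2 = \left[ \frac{\sum_{i=1}^{n} \left(Q_{obs,i} - \overline{Q}_{obs}\right)\left(Q_{sim,i} - \overline{Q}_{sim}\right)}{\sqrt{\sum_{i=1}^{n} \left(Q_{obs,i} - \overline{Q}_{obs}\right)^2}\,\sqrt{\sum_{i=1}^{n} \left(Q_{sim,i} - \overline{Q}_{sim}\right)^2}} \right]^2$$
where $Q_{obs,i}$ and $Q_{sim,i}$ are the observed and calculated flows at time step $i$, $\overline{Q}_{obs}$ is the average of the observed data and $\overline{Q}_{sim}$ is the average of the calculated values.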
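A minimal sketch of R² as the squared linear correlation between observed and calculated flows is given below. The function name and the series are hypothetical; the example is chosen to show that R² measures linear association only.

```python
def coefficient_of_determination(observed, simulated):
    """Squared Pearson correlation between observed and simulated series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)

# Hypothetical flows: the simulation is systematically biased (2 * obs + 1).
obs = [1.0, 2.0, 3.0, 4.0]
biased = [3.0, 5.0, 7.0, 9.0]
r2 = coefficient_of_determination(obs, biased)  # 1.0 despite the bias
```

Because a systematically biased simulation can still reach R² = 1, this statistic is best read together with measures such as the NSE and with the hydrographs themselves.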

Results and Discussion
The parameters of the PCSWMM model were analyzed for their sensitivity in relation to the peak flow in the sub-basins.
The model showed no sensitivity to the depression storage depth of the impermeable and permeable areas or to the Manning roughness of the impermeable areas. This insensitivity is mainly due to the low proportion of impermeable areas and to the relatively narrow predefined limits for these parameters. Krebs et al. (2013) found similar insensitivity to these parameters in the SWMM model.
The CN (Curve Number) and the Manning roughness of the permeable areas were the parameters with the highest sensitivity, followed by the runoff width and the area of the sub-basins, all analyzed within the specified uncertainty range. Formiga et al. (2016) performed a sensitivity analysis of SWMM parameters followed by automatic calibration with an R-NSGA optimization algorithm coupled to the model; in their study, the most sensitive parameters also included the runoff width and the area, in addition to the slope and the Manning roughness of the impermeable areas.
The CN changes the maximum flow by up to 65%. Its minimum value after adjustment was found in the Itapissirica sub-basin, in the southern region of the Goiana River basin, with a value of 17.14, corroborating some results found by
Regarding the quantitative goodness-of-fit measures, the Engenho Retiro, Caricé and Nazaré sub-basins, all with control points, presented the best Nash-Sutcliffe Efficiency (NSE) values: 0.919, 0.899 and 0.884, respectively. These values are classified as "very good" according to Moriasi et al. (2015). The same sub-basins also presented coefficients of determination R² considered "very good": 0.919 for Engenho Retiro, 0.918 for Caricé and 0.89 for Nazaré.
In the validation process, the Engenho Retiro and Caricé sub-basins obtained NSE values of 0.649 and 0.711, considered "satisfactory" and "good", respectively. The Nazaré sub-basin had an NSE of -5.03 and is therefore "unsatisfactory". Using a similar strategy, in which calibration is performed locally by sub-basin and followed by a global analysis with an optimization algorithm for automatic calibration, Gao et al. (2020) analyzed 57 sub-basins and obtained an unsatisfactory (negative) Nash-Sutcliffe efficiency in only one of them. In the validation analyses, R² remained satisfactory in the Engenho Retiro and Caricé sub-basins and was unsatisfactory for Nazaré, with a value of 0.49.
The NSE and R² results for the Engenho Retiro and Caricé sub-basins are consistent with the study by Wang et al. (2020), who developed an automatic calibration model with a genetic algorithm (GA), also applied to large basins.
The other sub-basins, Engenho Volta and Itapissirica, presented NSE results classified as "satisfactory" and "unsatisfactory", respectively. In validation, the Itapissirica sub-basin obtained an NSE of 0.812, that is, better than the calibrated result, whereas the Engenho Volta sub-basin had a "bad" result, with an NSE below zero. The coefficients of determination for these sub-basins were all above the limit considered satisfactory, both in calibration and in validation.
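The performance classes quoted above can be reproduced with a small lookup. The thresholds below are an assumption: they are the NSE limits for streamflow as commonly quoted from Moriasi et al. (2015) and should be checked against that reference. The NSE values are the ones reported in this study.

```python
def classify_nse(nse):
    """Performance class for streamflow NSE (assumed Moriasi-type limits)."""
    if nse > 0.80:
        return "very good"
    if nse > 0.70:
        return "good"
    if nse > 0.50:
        return "satisfactory"
    return "unsatisfactory"

# NSE values reported in the text, per sub-basin.
calibration = {"Engenho Retiro": 0.919, "Caricé": 0.899,
               "Nazaré": 0.884, "Engenho Volta": 0.508}
validation = {"Engenho Retiro": 0.649, "Caricé": 0.711, "Nazaré": -5.03,
              "Itapissirica": 0.812, "Engenho Volta": -2.51}

calibration_classes = {k: classify_nse(v) for k, v in calibration.items()}
validation_classes = {k: classify_nse(v) for k, v in validation.items()}
```

Under these assumed limits, the classes match those given in the text: for example, 0.649 falls in "satisfactory" and 0.711 in "good".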
The behavior of the model when analyzed by the Integral Square Error (ISE) was considered good for the Engenho Retiro, Caricé and Nazaré sub-basins, the same ones that obtained good NSE and R² results in the calibration process. Still in calibration, the Itapissirica sub-basin presented a "reasonable" result, and only the Engenho Volta sub-basin obtained a "bad" result for this index. All NSE and R² results are presented graphically in Figures 10 and 11 and in Table 5.
In validation, this index gave a "good" result in only one sub-basin, Itapissirica; in the others, the results ranged from reasonable to bad. This measure was also analyzed by Jaiswal et al. (2014), who evaluated the automatic calibration and validation of 41 sub-basins and obtained results rated "Great" for this index in all of them.
The correspondence between the observed hydrographs and those calculated by the PCSWMM model can be seen in Figures 12 to 16 for calibration and in Figures 17 to 21 for validation. The calibrated flows are, in general, better adjusted to the observations (the model fits best at both the maximum and minimum points). The performance of the model for the validation flow series is, in general terms, considered satisfactory; similar performance was found by Gao et al. (2020).
The best correspondence between peak flows after calibration is observed in the Nazaré sub-basin, with a variation of only 5.11%. Regarding the total runoff volume, the Caricé sub-basin showed a variation of 1.83% in relation to the observed data. In all sub-basins, after calibration, the variation between the observed and calculated peak flows did not exceed 25%. For the total runoff volume, the variation did not exceed 6% in 4 of the 5 sub-basins with control points. The other results for these criteria are presented in Tables 5 and 6. After calibration, the variations in maximum flow and total runoff volume were within the limits recommended by James (2005) in 90% of the cases. However, the author also argues that calibration tolerances need to consider the inaccuracies inherent in the observed data and the objectives of the hydrological modeling.
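The peak flow and runoff volume variations discussed above can be computed as in the sketch below. It assumes that the variation is the absolute percentage difference relative to the observed value and that flows are daily means in m3/s; the function names and the series are hypothetical.

```python
def percent_difference(observed_value, simulated_value):
    """Absolute difference as a percentage of the observed value."""
    return abs(simulated_value - observed_value) / observed_value * 100.0

def peak_and_volume_errors(observed, simulated, step_seconds=86_400):
    """Percent variation in peak flow and in total runoff volume."""
    peak_err = percent_difference(max(observed), max(simulated))
    # Volume = sum of mean flows times the duration of each time step.
    vol_obs = sum(observed) * step_seconds
    vol_sim = sum(simulated) * step_seconds
    return peak_err, percent_difference(vol_obs, vol_sim)

# Hypothetical daily mean flows (m3/s).
obs = [10.0, 40.0, 80.0, 30.0, 40.0]
sim = [12.0, 35.0, 76.0, 33.0, 40.0]
peak_err, vol_err = peak_and_volume_errors(obs, sim)  # about 5% and 2%
```

Note that the peak comparison uses the maximum of each series regardless of when it occurs, so timing errors have to be judged separately from the hydrographs.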

Conclusions
This study presented the sensitivity analysis and automatic calibration tools available in the PCSWMM hydrodynamic model. The analyses were performed in the hydrographic basin of the Goiana River in the State of Pernambuco, using observed flow data from five fluviometric stations on the main streams of the basin.
The sensitivity analysis features of the PCSWMM model allow the hydrological parameters of the watershed to be analyzed for various model responses (maximum, minimum and average flow, and total runoff volume), enabling an accurate assessment of the most sensitive parameters of the basin.
In relation to the statistical results (ISE, R² and NSE) and the qualitative comparison of observed and simulated hydrographs, most of the sub-basins showed quite satisfactory results. However, in two sub-basins (Engenho Volta and Nazaré), there was a discrepancy between the calibrated and validated results, which were classified as "satisfactory" and "good" in calibration (NSE = 0.508 in Engenho Volta and NSE = 0.884 in Nazaré) and "bad" in validation (NSE = -2.51 in Engenho Volta and NSE = -5.03 in Nazaré).
The automatic calibration tool available in the PCSWMM model was quite effective when compared with the trial-and-error calibration approach, which is usually quite time-consuming. It also allows immediate assessment of how the hydrograph shifts as the parameter being calibrated changes within the range specified by the user (SRTC resource, Sensitivity-based Radio Tuning Calibration).