Losses in water distribution networks – a bibliometric review: general aspects and optimization

Research on water distribution networks is of great importance, given the social, environmental and economic impacts caused by the scarcity of water resources. Therefore, any scientific effort devoted to studying water distribution systems is highly relevant. Techniques such as mathematical modeling, computer simulation and statistical methods are widely used to obtain more reliable answers, whether for identifying the current state of the network or for predicting scenarios, failure events, increased demand, etc. The objective of this work is to carry out a bibliometric analysis to identify the state of the art of research on water distribution networks aimed at the control and reduction of water losses, which will serve as a guide for future works to build on the most relevant research on the theme. The methodology analyzed the metadata of 4188 documents retrieved from the Web of Science database. As a result, a geographical view of the theme was obtained, identifying the main countries, affiliations, journals and researchers, as well as the main documents and the relevance of the theme. It can be concluded from the results that bibliometric analysis is an important tool for obtaining the state of the art. With it, it is possible to better understand the current state of research development, familiarizing researchers with what is most current and relevant.


Introduction
The existence of human life is connected to water consumption. Actions aimed at the rational use of this resource are therefore required, considering its finiteness and a steadily growing demand linked to population growth (Shekofteh, et al., 2020).
For human consumption, water needs to have characteristics that enable its use. After the necessary treatment, it is supplied to the population through distribution networks of several kinds, the most usual of which operates by gravity. Water distribution systems need to meet regulatory requirements regarding potability, impermeability and pressure levels (Kerwin & Adey, 2020).
Due to the complexity of supplying water to the population, usually associated with poor management, geographical challenges, or even attacks, several failures arise in treated water distribution systems. It is therefore useful to provide researchers and decision makers involved in the topic with an overview of the scientific development in the area (Liu & Lansey, 2020). According to (Shamseer, et al., 2015), systematic reviews are of great importance because they point to future directions for the research themes they address.
Thus, the general aim of the present research is to perform a bibliometric analysis that presents the state of the art on the topic of water distribution networks.
The general aim was structured into the following specific aims:
• Collect data on authors, documents, journals, affiliations, countries and themes from the Web of Science (WOS) scientific journal database;
• Process and handle the imported metadata;
• Calculate the bibliometric indicators to analyze the collected data (authors, documents, journals, affiliations, countries and themes);
• Analyze the measured bibliometric indicators.
It must be emphasized that, given technological advances and the marked increase in the amount of research performed, structuring a systematic review on a given research theme requires more than the experience of the researcher. In this sense, the quantitative analysis of the production of a given area, based on bibliometric indicators, is gaining more and more space because it expands the possibility of providing an overview of a theme (Aria & Cuccurullo, 2017).
Research, Society and Development, v. 10, n. 12, e407101220659, 2021 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v10i12.20659

Methodology
The bibliometric analysis used as the research method was organized in a segmented structure adapted from the PRISMA-P method, respecting its protocols, and consists of 3 stages: "Data import", "Data transformation" and "Data modelling and mapping". The organization chart presented in Figure 1 represents these stages. In stage 1, the focus was on locating documents concerning water distribution networks. To obtain these documents, the Web of Science, one of the largest journal query databases, was used. The keywords used as descriptors of the theme were: "Water distribution networks", "Water losses", "Water loss control" and "Leakage water".
Initially, the metadata was obtained from the journal database without applying any kind of filter. The samples were therefore initially composed of several types of documents: articles, books, book chapters, congresses and congress papers, among others. All the information available about the documents returned by the search was used, that is, bibliographical information, abstracts, citation information, keywords and other information.
The second stage begins with the import of the samples, with the metadata obtained from the WOS database in plain text format, into the R software for treatment and analysis. It is important to highlight that the open-source programming language R is widely used in statistics and has tools for implementing bibliometric analysis in the bibliometrix package (Aria & Cuccurullo, 2017). The import was performed with the convert2df function.
Still regarding the second stage, the previous process resulted in a single matrix in which the documents are arranged in rows and the columns hold the different pieces of information about them. The matrix is available in the supplementary material of this article.
As the last process of stage 2, a filter was applied so that the matrix was composed only of articles written in the English language. This filter is necessary because it allows the standardization of citations and encoding, which in turn makes possible the calculation of the bibliometric indicators in the next stage of the research method. A final filter was applied in this stage, limiting the publication period of the articles to the year 2020, which is the last complete year before the date of implementation of the present research. To apply these filters, functions from the dplyr package (Wickham, et al., 2021) were used.
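The filtering step described above can be illustrated outside R. The sketch below is not the authors' dplyr code; it is a minimal Python illustration that assumes each record is a dictionary keyed by Web of Science field tags (`DT` for document type, `LA` for language, `PY` for publication year). The function name `filter_records` and the sample records are hypothetical.

```python
def filter_records(records, max_year=2020):
    """Keep only English-language articles published up to max_year."""
    kept = []
    for rec in records:
        # Field tags DT, LA and PY are assumed to follow the Web of Science export.
        is_article = rec.get("DT", "").strip().upper() == "ARTICLE"
        is_english = rec.get("LA", "").strip().upper() == "ENGLISH"
        year = rec.get("PY")
        in_period = year is not None and int(year) <= max_year
        if is_article and is_english and in_period:
            kept.append(rec)
    return kept


# Hypothetical records, for illustration only.
sample = [
    {"TI": "A", "DT": "ARTICLE", "LA": "ENGLISH", "PY": 2019},
    {"TI": "B", "DT": "ARTICLE", "LA": "PORTUGUESE", "PY": 2018},
    {"TI": "C", "DT": "PROCEEDINGS PAPER", "LA": "ENGLISH", "PY": 2015},
    {"TI": "D", "DT": "ARTICLE", "LA": "ENGLISH", "PY": 2021},
]
filtered = filter_records(sample)
```

Only record "A" passes all three conditions; the others fail on language, document type or publication year, respectively.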
Thus, the end of the second stage corresponds to the final matrix, containing the metadata of all articles from the database.
This matrix allows the subsequent calculation and analysis of bibliometric indicators, which corresponds to stage 3 of the research method. An overview of the organized sample of articles is presented in Table 1. The frequencies of authors' keywords over time were also analyzed, as a proxy for the advancement of the themes most worked on nowadays.
Subsequently, the authors in the area are evaluated by both productivity and impact criteria. For the productivity analysis, an overview was obtained from the analysis of the distribution of publications according to Lotka's law; the amount of publications and the amount of publications divided by the amount of authors of each publication were also calculated. These last two productivity metrics were calculated only for the authors with the highest research impact, measured by the H-Index. To complement the impact analysis, the amount of local citations received by each of the selected authors was also calculated, that is, the amount of citations that the authors received from the other authors in the selected sample of 4,188 documents, as a proxy for citations from authors necessarily from the same area.
Next, the articles with the greatest impact in the area are evaluated. In this case, the impact criterion was the amount of local citations, that is, the amount of citations that the articles received from the other 4,188 articles in the selected sample, as a proxy for relevance within the theme. As a complement, the total amount of citations the articles received is also presented.
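The two productivity metrics mentioned above (number of publications, and publications divided by the number of authors of each publication) can be sketched as follows. This is an illustrative Python version, not the bibliometrix implementation; the function name `author_productivity` and the input layout (one list of author names per paper) are assumptions.

```python
from collections import defaultdict


def author_productivity(author_lists):
    """Return (full counts, fractionalized counts) per author.

    Each paper contributes 1 to the full count of every author and
    1 / (number of authors) to each author's fractionalized count.
    """
    counts = defaultdict(int)
    fractional = defaultdict(float)
    for authors in author_lists:
        unique = list(dict.fromkeys(authors))  # drop duplicate names, keep order
        if not unique:
            continue
        share = 1.0 / len(unique)
        for author in unique:
            counts[author] += 1
            fractional[author] += share
    return dict(counts), dict(fractional)


# Hypothetical sample: three papers with 2, 1 and 4 authors.
papers = [["A", "B"], ["A"], ["A", "B", "C", "D"]]
counts, fractional = author_productivity(papers)
```

In this sample, author "A" has 3 publications and a fractionalized count of 0.5 + 1 + 0.25 = 1.75.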
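The notion of local citations used above can be made concrete with a short sketch. It assumes, hypothetically, that each document has an `id` and a list of `references` already resolved to the same kind of identifier; in practice, matching cited references to documents in the sample is the hard part and is not shown here.

```python
from collections import Counter


def local_citation_counts(documents):
    """Count, for each document, how many other documents in the sample cite it."""
    ids = {doc["id"] for doc in documents}
    counts = Counter({doc_id: 0 for doc_id in ids})
    for doc in documents:
        # A set avoids counting the same reference twice within one document.
        for ref in set(doc["references"]):
            if ref in ids and ref != doc["id"]:
                counts[ref] += 1
    return dict(counts)


# Hypothetical sample: "x9" is a reference outside the sample and is ignored.
docs = [
    {"id": "d1", "references": ["d2", "x9"]},
    {"id": "d2", "references": []},
    {"id": "d3", "references": ["d2", "d1"]},
]
local = local_citation_counts(docs)
```

Citations coming from outside the sample are not counted here; they would appear only in the total (global) citation count reported by the database.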
At the same time, the most relevant sources for the area are estimated by the impact criterion of the respective publications. For this, once again the H-Index of each journal was used as an impact criterion. Next, the amount of publications of each journal over time is presented.
Finally, the contribution of the authors' affiliations and countries is evaluated, based on the amount of publications as a proxy for productivity. For the countries, an evaluation of international collaboration is also added, based on the amount of articles with authors from multiple countries. All the bibliometric indicators described above were implemented with functions of the R bibliometrix package (Aria & Cuccurullo, 2017).

Conceptual Structure Analysis
The thematic map presented in Figure 2 is a qualitative analysis that demonstrates the relevance of the themes addressed. The upper right quadrant contains the most central and densest themes, considered the driving themes of the research. The upper left quadrant contains less central but denser themes, where very specific research is conducted with little production volume. The lower left quadrant contains the less central and less dense themes, which are very common and of little relevance. Finally, the lower right quadrant shows themes that are very central but not very dense; these are common themes with a large volume of publications due to the simplicity of the theme.
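The quadrant logic described above can be sketched in a few lines. The example assumes that each theme already has a centrality and a density value, and it splits the plane at the median of each measure; the cut-off rule, the function name `classify_themes` and the sample values are assumptions for illustration, not the procedure used to produce Figure 2.

```python
from statistics import median


def classify_themes(themes):
    """Assign each theme to a quadrant from its centrality and density."""
    c_cut = median(t["centrality"] for t in themes)  # assumed cut-off
    d_cut = median(t["density"] for t in themes)     # assumed cut-off
    labels = {}
    for t in themes:
        central = t["centrality"] >= c_cut
        dense = t["density"] >= d_cut
        if central and dense:
            labels[t["name"]] = "upper right: driving theme"
        elif dense:
            labels[t["name"]] = "upper left: specific theme"
        elif central:
            labels[t["name"]] = "lower right: common theme"
        else:
            labels[t["name"]] = "lower left: low relevance theme"
    return labels


# Hypothetical themes and values.
themes = [
    {"name": "a", "centrality": 4, "density": 4},
    {"name": "b", "centrality": 1, "density": 3},
    {"name": "c", "centrality": 2, "density": 1},
    {"name": "d", "centrality": 3, "density": 2},
]
quadrants = classify_themes(themes)
```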

Authors' keywords for Word growth generation.
Figure 3 presents the development of the theme according to the authors' keywords from the year 1990 onward. Themes that focus on water distribution networks, such as management, design, model and optimization, mostly aim at a more efficient control of the system, seeking a balance between system resilience and operational cost. Drinking-water and water, on the other hand, are generally linked to potability research. Other themes are used for the description of the research. It is interesting to note that all themes show growth curves with an exponential growth rate and no indication of a turning point.

Authors Analysis
First, in order to have an overview of the authors' productivity in this field of study, the fit to Lotka's law was estimated. By applying the Kolmogorov-Smirnov two-sample test, a p-value of 0.12 was calculated, indicating no statistically significant difference between the observed distribution and the theoretical distribution given by Lotka's law.
As a way of measuring the scientific production of researchers, the h-index created by (Hirsch, 2005) is applied. Table 2 identifies the top ten authors according to the h-index, together with the total amount of published articles, the fractionalized amount of documents (basically the share of participation of a researcher in each published article) and the total amount of local citations, which are received from researchers in related fields. It is worth mentioning that under some other criteria the order of the authors could be different; however, the h-index is the most widely accepted among the scientific community.
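As an illustration of the comparison described above, the sketch below contrasts an observed author productivity distribution with Lotka's law in its classical inverse-square form, f(n) proportional to 1/n^beta with beta = 2. It returns only the maximum distance between the two cumulative distributions (the Kolmogorov-Smirnov statistic), not the p-value. The fixed exponent, the normalization over the observed range of n, and the sample data are assumptions; they are not the fitted values behind the reported p-value of 0.12.

```python
from collections import Counter


def lotka_comparison(papers_per_author, beta=2.0):
    """Compare observed author productivity with Lotka's law.

    Returns (observed proportions, theoretical proportions, KS distance),
    where position n-1 of each list refers to authors with n papers.
    """
    freq = Counter(papers_per_author)  # n papers -> number of authors
    n_max = max(freq)
    total_authors = sum(freq.values())
    observed = [freq.get(n, 0) / total_authors for n in range(1, n_max + 1)]
    # Lotka's law, normalized over 1..n_max (an assumption of this sketch).
    weights = [1.0 / n ** beta for n in range(1, n_max + 1)]
    norm = sum(weights)
    theoretical = [w / norm for w in weights]
    # Maximum absolute difference between the cumulative distributions.
    distance = 0.0
    cum_obs = cum_theo = 0.0
    for o, t in zip(observed, theoretical):
        cum_obs += o
        cum_theo += t
        distance = max(distance, abs(cum_obs - cum_theo))
    return observed, theoretical, distance


# Hypothetical sample: 6 authors with 1 paper, 2 with 2 papers, 1 with 3 papers.
observed, theoretical, distance = lotka_comparison([1] * 6 + [2] * 2 + [3])
```

A small distance means the observed distribution is close to the theoretical one; turning the distance into a p-value requires the sampling distribution of the statistic, which this sketch does not implement.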
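For reference, the h-index can be computed directly from a list of citation counts: it is the largest h such that at least h publications have at least h citations each. The sketch below is a straightforward Python version with hypothetical inputs.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author's publications.
example_h = h_index([10, 8, 5, 4, 3])
```

In this example the fourth most cited paper has 4 citations and the fifth has only 3, so the h-index is 4. The same function applies to a journal when given the citation counts of the articles it published.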

Documents analysis
Table 3 arranges the top 10 articles according to local citations. It is interesting to note the publication years of the top 10 cited articles, which range from 1977, the oldest, to 2013, the most recent. The research developed by (Alperovits, 1977) aims to optimize the water distribution system, as do (Prasad, 2004), (Simpson, 1994), (Dandy, 1996), (Maier, 2003), (Araujo, 2006) and (Ostfeld, 2008), who used the genetic algorithm in its most diverse variations to achieve the proposed optimization objective. These are not contemporary studies, but they remain among the main studies on optimization in water distribution networks, being the ones with the highest amounts of local citations received. The research conducted by (Ostfeld, 2004), like that of (Rossman, 1994), addresses the issue of water potability.

Sources analysis
Figure 5 shows the evolution of publications of the journals with the highest amount of publications. One can observe a significant growth of the Journal of Water Resources Planning and Management, especially from the second half of the 1990s, making it currently the journal with the highest amount of publications.
In the opposite direction, the journal of the American Water Works Association, which presented the highest productivity until the first half of the 2010s, ended up reducing its participation in the area, partly because of the Water Research and Water Resources Management journals. Water Research has the best h-index among the 10; however, its volume of published articles is lower than that of the journal mentioned above, and it is therefore the second most relevant.
Analyzing the other journals presented in Table 4, it is possible to notice a significant variation between their indexes. The same happens with the two most important journals; however, the amounts of published articles follow a logical order and, as they are the main factor analyzed here, they determine the classification of the journals mentioned.

Affiliation analysis
Table 5 lists the top 10 affiliations that publish the most papers on the topic of water distribution networks.
The University of Adelaide, in Australia, has the highest scientific output, with a total of 119 published articles, followed by the University of Exeter, in the United Kingdom, with 107 publications. The other universities are listed in the table, and one observation has to be made: research on water distribution networks is being carried out all over the world, as can be seen from the affiliations listed in the top 10, located in countries in Asia, Europe, America and Oceania.

Countries analysis
Table 6 shows the 10 countries with the highest amount of published articles.
Table 6 makes evident the importance of research on water distribution networks. This research concerns a finite natural asset of vital importance to human life, which is why it is developed around the world. On the American continent, the U.S. is the country with the highest amount of published research. Oceania also plays a prominent role, with Australia being the second largest country in volume of publications. Other countries around the globe also address water distribution networks in their research, which meets the need to preserve a natural resource vital to human survival.
Figure 6 shows the documents by country for articles with more than one author. These can be of the SCP type (single country publications), in which all authors are of the same nationality, or of the MCP type (multiple country publications), in which at least one of the authors is of another nationality.
Figure 6 shows that the USA is the country that publishes the highest amount of articles; however, it is not the country with the highest rate of collaboration. It stays behind Germany, which has the highest collaboration rate among the 10 countries listed and produces more articles with researchers of other nationalities.
Analyzing the dynamics of the other countries, it is evident that the publication of articles in the SCP pattern is higher than in the MCP pattern. However, research is being carried out on all continents, which becomes another form of collaboration.
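The SCP/MCP classification can be sketched as follows. The example assumes, hypothetically, that each document carries the list of its authors' countries and the country to which the document is attributed (here called `corresponding_country`); these field names and the sample data are illustrative and do not reproduce the data behind Figure 6.

```python
from collections import defaultdict


def collaboration_by_country(documents):
    """Count SCP and MCP documents per country and the share of MCP."""
    stats = defaultdict(lambda: {"SCP": 0, "MCP": 0})
    for doc in documents:
        countries = set(doc["author_countries"])
        # More than one distinct country among the authors means MCP.
        kind = "MCP" if len(countries) > 1 else "SCP"
        stats[doc["corresponding_country"]][kind] += 1
    result = {}
    for country, s in stats.items():
        total = s["SCP"] + s["MCP"]
        result[country] = {"SCP": s["SCP"], "MCP": s["MCP"],
                           "MCP_ratio": s["MCP"] / total}
    return result


# Hypothetical documents.
docs = [
    {"corresponding_country": "USA", "author_countries": ["USA", "USA"]},
    {"corresponding_country": "USA", "author_countries": ["USA", "CHINA"]},
    {"corresponding_country": "GERMANY", "author_countries": ["GERMANY", "ITALY"]},
    {"corresponding_country": "USA", "author_countries": ["USA"]},
]
collab = collaboration_by_country(docs)
```

The sample shows how a country can lead in volume (USA, three documents) while another has the higher collaboration rate (Germany, with an MCP ratio of 1.0 in this toy data).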

Conclusion
As mentioned before, water is a finite and vital resource for human existence; therefore, any way to avoid wasting it is of utmost importance. In this sense, studies on water distribution networks are necessary.
In the present work, a bibliometric analysis was applied to the metadata obtained from the Web of Science journal database. The objective of this analysis was to identify the current state of the art, giving an overview of the research on water distribution networks.
The metadata obtained is composed of 4188 documents spanning the period from 1946 to 2020. It was found that, between 1946 and 1990, little research effort was devoted to water distribution networks. The scenario begins to change in the early 1990s, a period in which the theme starts to attract a larger volume of research. However, it is worth mentioning that, even with a growing scientific production, the theme still shows a low average of citations per article per year, indicating that more research on the theme needs to be conducted.
Geographically, it was found that research on the theme studied here is being conducted all over the world, and the most relevant affiliations are not on the same continent, which again indicates the diffusion of the theme around the world. Like the affiliations, the authors are also of different nationalities, coming from countries such as Israel, Australia, the United Kingdom and Canada, among others, which consolidates the importance of the subject. The journals are also from countries on different continents, corroborating that the subject of water distribution networks is of utmost importance and widely recognized.
The bibliometric analysis also addressed the relevance of the themes that focus on water distribution networks, pointing out themes that are beginning to fall into disuse and themes that are beginning to emerge and are consolidating as denser and more central ones.
It can be concluded from the results obtained that bibliometric analysis is an important tool for obtaining the state of the art.
With it, it is possible to have a better understanding of the current situation in the development of research, making researchers more familiar with what is most current and relevant.

Suggestions for future works
• Perform a bibliometric analysis with data from more journal databases;
• Perform a bibliometric analysis only with articles in Portuguese and compare their indexes with those of articles in English;
• Conduct a systematic review of the effectiveness of the statistical models applied;
• Conduct a quantitative survey of how many of the studies developed on real networks were actually applied.