The influence of crosses, shots, corner kicks and defensive movements on the results of Premier League matches

Technological advances have made performance analysis in football more efficient, and research in this field has grown accordingly, because knowing which performance indicators determine success, and developing more predictive analyses to understand them, is essential. This research therefore aimed to investigate the influence of crosses, shots, corner kicks and defensive movements on winning Premier League (PL) football matches. The methodology is quantitative, descriptive and documentary, and the sample comprises all PL 2015/2016 matches. The analysis used the association technique from the data mining process, supported by the software Weka. Results demonstrated the influence of the variables on PL victories: making more crosses than opponents was not effective for winning PL matches, whereas making more shots on goal than opponents was a positive indicator of victory, as was combining more defensive movements with fewer crosses than opponents. As for corner kicks, results showed no influence of this component on match results.


Introduction
As a collective team sport, football is not limited to footballers' actions on the pitch; it also involves the availability and subsequent application of information about footballers within the football context, and achieving success demands this connection with existing theory (Garganta, 1997; Wilson, 2013).
Since teams operate as dynamic systems, facing contextual issues that can be unpredictable and follow no fixed order, every game variation relates to a singular match moment and connects with the desire for success and the need for excellence in performance (Garganta, 2008; Garganta et al., 2000; Santos, 2012; Tani, 2001).
Technological advances that allow analysis to make teams and players more efficient and effective have increased (Afonso et al., 2020; Jones et al., 2020; Silva et al., 2020; Alves et al., 2021; Rocha-Lima et al., 2021), and many studies of football performance analysis have been conducted, since knowledge of the performance indicators that can determine success in football is critical, as is the need for more predictive analysis to better understand them (Lepschy et al., 2018, 2020). Among these indicators are crosses, corner kicks, defensive movements (aerial duels, interceptions and tackles) and goals scored relative to the total number of shots on goal (attacking efficiency).
A cross can be defined as a ball sent by a footballer into the opposition team's area from a wide position on the pitch (OPTA, 2018). Crossing has already been considered in the literature in connection with goal-scoring opportunities (Flynn, 2001; Hughes, 1990; Kuper & Szymanski, 2014; Pulling et al., 2018), as part of a game model (Perarnau, 2017) and in relation to match results (Fernandez-Navarro et al., 2018; Lepschy et al., 2020; Zhou et al., 2018), and this literature supports the discussion of the data in this article.
Corner kicks, in turn, are awarded to the attacking team when the defending team makes the last contact with the ball before it leaves the pitch over the goal line (Luongo, 1996). Authors have associated corner kick execution with match results (Anderson & Sally, 2013; Rocha-Lima, 2018), league table positions (Gollan et al., 2018; Souza et al., 2019), defending strategies (Kubayi & Larkin, 2019; Pulling & Newton, 2017), the areas and methods of delivering the ball into the area (Beare & Stone, 2019; Pulling, 2015; Strafford et al., 2019) and goal-scoring predictions (Anderson & Sally, 2013; Pulling, Robins, & Rixon, 2013; Taylor et al., 2005), which also support the discussion of the data presented in this research.

Methodology
This research is characterized as quantitative, descriptive and documentary. This approach involves strict control of variables and the use of precise measurements, focusing on the objective analysis of the components of a large database (Thomas et al., 2012). The descriptive aspect aims to describe the characteristics of certain phenomena or the relations among the investigated variables (Gil, 1994).

Material
The material of this research consisted of data from the 380 matches of the PL, the first division of English professional football, played over the 38 matchweeks of the 2015/2016 season. Of these 380 matches, 157 were won by the home teams (41%), 107 ended in a draw (28%) and 116 were won by the away teams (31%).

Data Collection
Because this research required only statistical data from matches played in the PL 2015/2016 season, it was not necessary to submit it for approval by a research ethics committee. To obtain the data, the Daily Mail website was accessed, specifically its 'sport' section, which provides information about the PL.
Next, the subtopic 'RESULTS' was selected in order to check the match results of all matchweeks and to enter each match zone to extract the relevant indicators (crosses, corners, shots on target, goals based on total shots, match results and defensive movements such as tackles, interceptions and aerial duels) to be analyzed in this research.
Subsequently, the match zone of each PL 2015/2016 match was accessed and each match web page was saved as a text file with the ".txt" extension, so that the match indicators could be obtained. To facilitate the transfer of data from a spreadsheet to the data mining software Weka, detailed in the next topic, each file was named following a pattern, avoiding possible reading failures. The pattern has this format: "Match-(matchweek number, from 1 to 38)-(match order within the matchweek, from 1 to 10).txt". File names such as "Match-1-7.txt" and "Match-13-04.txt" illustrate the methodological strategy used.
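The naming pattern described above can be illustrated with a minimal Python sketch. This is not the authors' program (which was written in Perl and is not reproduced in the article); the function names `match_filename` and `parse_match_filename` are hypothetical, and accepting both unpadded and zero-padded numbers is an assumption based on the two example file names.

```python
import re
from typing import Tuple

# Accepts "Match-<matchweek>-<order>.txt"; one or two digits per field is an
# assumption, since the examples show both "Match-1-7.txt" and "Match-13-04.txt".
FILENAME_RE = re.compile(r"^Match-(\d{1,2})-(\d{1,2})\.txt$")


def match_filename(matchweek: int, order: int) -> str:
    """Build a file name for a given matchweek (1-38) and match order (1-10)."""
    if not 1 <= matchweek <= 38:
        raise ValueError("matchweek must be between 1 and 38")
    if not 1 <= order <= 10:
        raise ValueError("order must be between 1 and 10")
    return f"Match-{matchweek}-{order}.txt"


def parse_match_filename(name: str) -> Tuple[int, int]:
    """Recover (matchweek, order) from a file name, rejecting malformed names."""
    found = FILENAME_RE.match(name)
    if found is None:
        raise ValueError(f"unexpected file name: {name!r}")
    matchweek, order = int(found.group(1)), int(found.group(2))
    if not (1 <= matchweek <= 38 and 1 <= order <= 10):
        raise ValueError(f"matchweek or order out of range: {name!r}")
    return matchweek, order


# One file name per match: 38 matchweeks x 10 matches = 380 files.
all_names = [match_filename(w, o) for w in range(1, 39) for o in range(1, 11)]
```

Validating the ranges when parsing is one way to catch a misnamed file before its contents reach the spreadsheet or Weka.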

Instruments
Since the statistical data of the matches were available in ".txt" files, a computer program was developed in the Perl programming language to extract the relevant indicators of those matches. Finding the indicators is important for adequately measuring the research variables, as is selecting their classifiers for the data analysis step (Gil, 1994). The program analyzed the data automatically and generated a new file in the ".csv" format, in which the data are separated by commas. This ".csv" file was used with a spreadsheet application, Microsoft Excel, and with the data mining software Weka (Weka, 2020).
The spreadsheet application was used not only to facilitate data visualization, but also to make preliminary analyses of the collected data, to generate additional information about the extracted data and to adjust the stored numbers. The software Weka, in turn, allows the use of data mining techniques, such as association techniques, to obtain pertinent relations among the collected data.
Research, Society and Development, v. 10, n. 16, e477101624072, 2021 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v10i16.24072

Classifiers Determination
Classifying data is a relevant component of the analysis process, since it facilitates the execution of Weka's techniques and the understanding of results (Rocha-Lima, 2018). To determine the classifiers, the previously mentioned Excel spreadsheet was used, with logical expressions such as the IF logical test, which allowed a classifier to be assigned to each extracted indicator.
The classifiers were created according to the extracted indicators, as proposed by Rocha-Lima (2018). The match results classifier received 3 possible concepts: A, home team victory; B, draw; and C, away team victory. The classifiers for crosses, shots on target, corner kicks, tackles, interceptions and aerial duels likewise received 3 possible concepts: A, the home team recorded a higher value of the indicator than the away team; B, home and away teams recorded the same value; and C, the away team recorded a higher value than the home team.
Additionally, the classifier for the number of goals scored relative to total shots also received 3 concepts, based on Hughes (1990), who pointed out that a goal is scored every 9 attempts, that is, a conversion ratio of about 11%. Rocha-Lima (2018) therefore proposed the following concepts: 'A', if the number of goals divided by total shots was greater than or equal to 0.11; 'B', if this conversion ratio was between 0.06 and 0.11; and 'C', if it was lower than 0.06.
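As an illustration of the three-concept comparison just described, the following Python sketch mirrors what a spreadsheet IF test would do. The authors used Excel, so this is only an equivalent restatement; the function names `compare_classifier` and `result_classifier` are hypothetical.

```python
def compare_classifier(home_value: float, away_value: float) -> str:
    """Return 'A' if the home team has the higher value, 'B' if equal, 'C' otherwise."""
    if home_value > away_value:
        return "A"
    if home_value == away_value:
        return "B"
    return "C"


def result_classifier(home_goals: int, away_goals: int) -> str:
    """Match result: 'A' home victory, 'B' draw, 'C' away victory."""
    # The result follows the same comparison, applied to goals scored.
    return compare_classifier(home_goals, away_goals)
```

For instance, a match in which the home team made 12 crosses against 20 by the away team would receive concept C for crosses under this scheme.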
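The conversion ratio thresholds can be sketched in Python as follows. Two points are assumptions not stated in the text: a ratio of exactly 0.06 is placed in 'B' (because 'C' is defined as lower than 0.06), and a team with no shots is assigned 'C' to avoid dividing by zero. The function name `conversion_classifier` is hypothetical.

```python
def conversion_classifier(goals: int, total_shots: int) -> str:
    """Classify goals / total shots as 'A' (>= 0.11), 'B' (0.06 to < 0.11) or 'C' (< 0.06)."""
    if total_shots <= 0:
        # Assumption: no shots means no conversion, so the lowest concept applies.
        return "C"
    ratio = goals / total_shots
    if ratio >= 0.11:
        return "A"
    if ratio >= 0.06:
        # Assumption: the lower bound 0.06 is inclusive.
        return "B"
    return "C"
```

Under these thresholds, 1 goal from 9 shots (about 0.111) is classified 'A', matching the reference value of one goal every 9 attempts, while 1 goal from 10 shots (0.10) is classified 'B'.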

Data Analysis
Once all data of interest from the 380 PL matches had been collected and the classifiers had been defined and stored in the ".csv" file, the next step was to analyze these data to identify relations between the variables and the teams' success in PL matches. For this step, the software Weka (Weka, 2020) was chosen as an auxiliary analysis instrument to apply the association technique, which generates association rules between the variables and classifiers analyzed.
The association technique is characterized by association rules (Besemann et al., 2004; Oliveira et al., 2004; Zheng et al., 2001), which can be represented as an implication in the format "A → B with probability P" (Agrawal & Srikant, 1994; Yang, 2005), describing the occurrence of B given A, with probability P. Support and confidence are the most used measures (Weiss & Zhang, 2003): the first is the frequency of the relation and the second is its strength.
Weka was therefore run on the ".csv" file, producing a large set of association rules that was analyzed for contributions to the objective of this research. This characterizes the data exploration process, which prioritizes validating the information, confronting it with existing findings and possibly reaching broad and effective generalizations (Gil, 1994).
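To make the two measures concrete, the following Python sketch computes support and confidence for a single rule "A → B" using their standard definitions: support is the share of all records in which A and B occur together, and confidence is the share of records satisfying A that also satisfy B. It does not reproduce Weka's implementation or output. The column names and the four records are invented for illustration and are not data from this study.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def support_confidence(
    rows: List[Record], antecedent: Record, consequent: Record
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""

    def matches(row: Record, condition: Record) -> bool:
        return all(row.get(key) == value for key, value in condition.items())

    n_total = len(rows)
    n_a = sum(1 for row in rows if matches(row, antecedent))
    n_ab = sum(
        1 for row in rows if matches(row, antecedent) and matches(row, consequent)
    )
    support = n_ab / n_total if n_total else 0.0
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence


# Invented toy records (NOT data from the study); concepts follow the A/B/C scheme.
rows = [
    {"crosses": "C", "shots_on_target": "A", "result": "A"},
    {"crosses": "C", "shots_on_target": "A", "result": "A"},
    {"crosses": "C", "shots_on_target": "A", "result": "B"},
    {"crosses": "A", "shots_on_target": "C", "result": "C"},
]

# Rule: home team has fewer crosses and more shots on target -> home victory.
support, confidence = support_confidence(
    rows, {"crosses": "C", "shots_on_target": "A"}, {"result": "A"}
)
```

In this toy set the rule holds in 2 of 4 records (support 0.5) and in 2 of the 3 records that satisfy the antecedent (confidence about 0.67). Under this reading, the winning percentages reported in the Results correspond to the confidence of rules whose consequent is a victory.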

Results
According to the Weka analyses, home teams making more crosses than away teams did not lead to a high winning percentage in the PL. However, in matches in which home teams made fewer crosses than away teams, home teams reached a higher winning percentage (Figure 1).
Figure 1 shows that in matches in which home teams made more crosses than away teams, home teams won 32% of the time, whereas in matches in which away teams made more crosses than home teams, home teams won 57% of their matches. Weka found no results for away teams' winning percentages for these variables.
Regarding the influence of total corner kicks on final match results, Figure 2 shows that home teams won 40% of their PL 2015/2016 matches when they gained more corner kicks than away teams, but achieved a higher winning percentage (46%) when they gained fewer corner kicks than away teams.
Those results suggest that, like making more crosses than opponents, gaining more corner kicks than opponents is not a good predictor of a high number of victories in the PL. Weka did not find any results for away teams' winning percentages.
Building on the data in Figure 2, Figure 3 presents additional winning percentages produced by Weka, associating total shots, shots on goal and corner kicks with match results. According to Figure 3, in matches in which home teams made more shots and gained more corner kicks than away teams, home teams won 45% of their matches. This winning percentage was even higher (53%) when home teams made more shots on target and gained more corner kicks than away teams.
Those results suggest that associating corner kicks with shots increases the winning likelihood, with shots on target being more worthwhile to explore than total shots. Weka found no results for away teams' winning percentages for those variables. The results also suggest that shots on goal are a good component to explore in order to predict victories.
For home and away teams alike, this component influenced final match results, indicating that making more shots on goal than opponents increased the winning likelihood. Finally, total corners in isolation did not influence final match results, as shown in Figure 2, but when associated with more shots or more shots on target than opponents, corners were linked to a higher winning percentage, especially in combination with shots on target, as shown in Figure 3.
Continuing the analysis, when total crosses and total shots on target are associated (Figure 4), home teams won 48% of their matches when they made more crosses and more shots on target than away teams, but their winning likelihood rose to 83% when they made more shots on goal and fewer crosses than away teams. Away teams, in turn, won 70% of the matches in which they made more shots on goal and fewer crosses than home teams (Figure 4).
Those results suggest that making more crosses than opponents was not a variable worth exploring in order to win PL matches, whether playing home or away. They also suggest that making fewer crosses than opponents was more advisable for reaching victories than making more. Thus, based on those results, making fewer crosses and more shots on target than opponents represents a relevant variable to explore in order to increase the winning likelihood in the PL.
Regarding the association between total crosses and defensive movements (aerial duels, interceptions and tackles), Figure 5 suggests that home teams won 64% of the matches in which they made fewer crosses and won more aerial duels than away teams, whereas away teams won 35% of the matches in which they made fewer crosses and won more aerial duels than home teams.
In addition, Figure 5 shows that teams that made fewer crosses and more interceptions than opponents won 68% of their matches when playing at home and 37% when playing away. Finally, Figure 5 suggests that teams that registered fewer crosses and more tackles than their opponents won 60% of their matches at home and 40% away. These findings reinforce the conception that crosses did not influence final match results in the PL 2015/2016.
Figure 5. Winning occurrence associated with defensive movements and total crosses. Source: Authors.

The lack of effectiveness of excessive crosses and the relevance of shots on target for winning
First, the results in Figures 1 and 2, which indicate that excessive crossing is an inappropriate alternative for winning football matches in the PL, are supported by the literature: authors argue that losing teams had significantly higher averages of crosses made, which are connected to a drop in the winning likelihood and, consequently, to failure (Lepschy et al., 2020; Zhou et al., 2018). As an example from international competition, in the 2014 FIFA World Cup, of a total of 1332 open play crosses observed across all 64 matches, only 3.2% (42 goals) resulted in goals scored (Pulling et al., 2018).
As an example at club level, the English club Liverpool Football Club, in the 2011/2012 PL season, based its game model on a crossing strategy aimed at striker Andy Carroll, signed along with Jordan Henderson and Stewart Downing to maximize this strategy. Carroll was tall and would provide an aerial advantage and score goals through headers, while the other two footballers had been responsible for considerable offensive production at their previous teams. The expectations created resulted in failure, however, since only one goal was scored from 421 open play crosses (Kuper & Szymanski, 2014).
The same authors stress that it is quite easy to defend and gain an advantage against teams whose intentions are already known, that is, teams that rely on a tall striker and two good crossers and are unable to find alternatives to crossing the ball into the area to try to score. They argue that crossing from a set piece can make sense, since there is time and space to be precise, but asking a footballer to run down the flank and cross the ball all the time, followed by a defender, is a waste of time and effort (Kuper & Szymanski, 2014).
It is pertinent to emphasize that excessive use of crosses lowers the likelihood of winning, since it limits the game model and, consequently, the use of other alternatives that may be more pertinent to solving match problems, such as penetrative passes (Tenga & O'Donoghue, 2017) and shots from medium and long distance (Rocha-Lima, 2018). However, the problem with crossing lies in the excess. Crosses can be adequate attacking solutions at certain match moments, so they can be trained to become more efficient by positioning footballers and the ball in the most favorable areas to score goals (Rocha-Lima, 2018).
Although increases in goal-scoring efficiency through crosses are not frequently reported, even though crosses represent a tactical solution for this purpose (Flynn, 2001), the literature contributes to this discussion with knowledge about how to improve crossing efficiency. It is argued that 64% of goals scored from crosses are finished with headers and that 22.4% of goals from crosses are scored near the far post, seen as the main area to score goals, since 4 of 5 goals coming from crosses are scored in this area (Hughes, 1990).
Pep Guardiola, in turn, while coach of Bayern München, organized his team structure to maximize the actions of his wingers, allowing them to get close to the area and deliver low, hard crosses toward the near post. The aims were a finish by the strikers, a mistake by the opposing defenders that could result in an own goal, or winning the second ball, with midfielders advanced to create numerical superiority in the final third, recover the ball close to the opponent's goal and quickly score (Perarnau, 2017). This is a good suggestion for maximizing crossing efficiency.
Moreover, having more shots on target than the opponents, as can be seen in Figure 2, not only shows the relevance of this indicator for achieving a high number of wins, but is also consistent with the literature supporting its relevance, as well as the influence of venue (Anderson & Sally, 2013; Rocha-Lima, 2018).

The influence of venue and defensive movements on winning matches
The mention of venue in the last paragraph connects with the results presented in Figure 3, since in this figure, besides venue, another main aspect stands out as important for this discussion: performing better defensive movements than opponents when playing at home.
The data in Figure 3 make clear that outperforming opponents in defensive movements at home is an important variable to explore. Above all, however, venue proves to be the most relevant indicator behind these predictions, since visiting teams registered poor winning percentages compared with home teams.
Accordingly, authors have noted that venue and its advantages are already being investigated in sport (Legaz-Arrese et al., 2013). It has also been suggested that venue advantage exists when teams win over 50% of their home matches, with team quality appearing to have an important influence on this advantage (Rooney & Kennedy, 2018).
Moreover, the literature reinforces that match venue is a relevant aspect of teams' defensive and offensive performance: the best teams defend more consistently, do not depend on many goalkeeper interventions, and improve their performance when playing at home owing to various factors, such as the presence of supporters, which can raise footballers' aggressiveness and precision in defensive movements (Almeida et al., 2014; Lago-Peñas & Lago-Ballesteros, 2011; Mackenzie & Cushion, 2012; Sarmento et al., 2014).
In addition, venue had a significant effect on all play styles in the 2015/2016 Premier League season (Fernandez-Navarro et al., 2018), and previous research reported a predicted winning probability of 42% when playing at home and 27% when playing away (Anderson & Sally, 2013).
The high winning percentage associated with playing at home and defensive movements can be explained by the fact that maximizing attacking movements is not enough to guarantee a high probability of success; improving in defense is also relevant for achieving a large number of wins, which is supported by the literature.
Authors emphasize that defensive errors had a high influence on performance estimates, and that home advantage is an important contextual effect (Lepschy et al., 2020). Authors also argue that the best teams recovered the ball faster than teams considered inferior and when losing (Vogelbein et al., 2014), and that teams that accounted for less than half of a match's lost balls won 44% of their matches, against 27% for teams that lost the ball more often (Anderson & Sally, 2013).
Additionally, the results in Figure 3, despite the balance among winning percentages, indicate interceptions combined with playing at home as the most interesting defensive movement to explore in the PL, which is also supported by the literature. Authors consider ball recovery a disruptive event that creates moments of imbalance in the match and in team organization, and thus transition moments from defense to attack (Barbosa, 2014; Barreira et al., 2014).
Interception is regarded as the most advantageous way to recover ball possession (Barreira et al., 2014; Garganta, 1997), and also the most frequent one, arising from opponents' misplaced passes, followed by tackles (Claudino, 1993), which are the defensive movements most used by mid-level teams (Almeida et al., 2014). Moreover, since direct play and having less ball possession appear to be an interesting alternative to explore in order to win football matches in the PL (Rocha-Lima, 2018), defensive approaches are expected to be well performed in this context, which is also addressed by other authors.
It has been reported that between 1999 and 2019 overall offensive play decreased while defensive play increased, with the former being more associated with success than the latter (Lane, van der Ploeg, Greenham, & Norton, 2020). This raises the point that it is possible to be offensive in football without dominating components of indirect play, such as having more ball possession and more total passes than opponents.
Accordingly, authors conclude that not only offensive dimensions of football but also defensive and contextual dimensions (venue, match status and opponent's quality) have an important impact on shooting effectiveness in European football (Gonzalez-Rodenas et al., 2020), as well as on ball recovery location and on defensive and offensive positioning lines (Santos et al., 2017). In addition, defensive effectiveness against finishes conceded and the volume of ball recoveries were the defensive variables most strongly associated with points earned in the Spanish first division (Souza et al., 2019).

Effect of corners on match results
Figures 4 and 5 show that having more corners than opponents does not guarantee a large number of wins; that goals scored and venue are much more relevant than corners for obtaining a higher winning percentage; and that the combination of having fewer corners than opponents, playing at home and reaching concept A for goal conversion based on total shots seems promising to explore. In the literature, some authors argue that the number of shots and the number of corners were the attacking variables most strongly connected to points earned during a season (Souza et al., 2019), while others concluded that high-ranked teams demonstrate control of established offence and set pieces (Gollan et al., 2018).
In further support of these results, authors also note that getting more corners than opponents does not lead to a high winning percentage, since of 1434 corners only 20.5% led to a finishing movement (Anderson & Sally, 2013). In the 2001/2002 PL season, only 6 goals were scored from 217 corners (Taylor et al., 2005). In the 2011/2012 PL season, only 4.1% of 436 corners turned into goals, and in the 2010 Major League Soccer season only 2.2% of 1859 corners were converted into goals (Pulling et al., 2013).
However, despite these low conversion ratios, authors also point to possibilities for maximizing the use of corners: even though their conversion ratio can be low, goals can be scored from them if the ball is delivered into areas more favorable to scoring. In one study, although only 9 goals were scored from 328 corners, there was a higher probability of scoring when the ball was delivered close to the penalty mark, given that 4 of these 9 goals were scored in this area (Pulling, 2015).
In addition, although only 4.6% of 824 corners resulted in a goal in another study, delivering corners into the central area of the 18-yard box increased the number of attempts on target, and delivering the ball into the central zone of the area but closer to the goal line led to a higher probability of scoring (Beare & Stone, 2019). The literature also provides knowledge about offensive and defensive strategies at corners. It is reported that goals were scored from corners through dynamic attacking organizations, with two defenders on the posts and when the score line was level (Strafford et al., 2019).
Although only 22 goals were scored from 600 corners at the 2018 FIFA World Cup, most of those goals came from the center of the area and the near post, from inswinging corners and when opponents adopted a zonal marking strategy (Kubayi & Larkin, 2019). As noted in the discussion of open play crosses, the failure lies in the excess, in having a greater quantity of these moves (crosses and corners) than the opponents. Working to make these moves more precise, seeking the most promising areas and ways to deliver the ball in order to create more chances to score, while also defending more soundly, can be a useful alternative to explore, based on the results presented and their discussion.

Conclusion
Since this research aimed to investigate the influence of crosses and corners on winning PL football matches, it is possible to conclude that registering higher numbers of both indicators than opponents leads to a low winning percentage in the PL.
Given the results presented, it is advisable to register lower numbers of crosses and corners than opponents and to combine these with more shots on target or concept A for goals scored based on total shots in order to achieve higher winning percentages.
Beyond the effect of shots on target and goals scored based on total shots on winning probabilities, venue also proved to be a relevant aspect for high winning probabilities. Results indicated that playing at home, combined with fewer crosses or corners than away teams and with more shots on target or concept A for goals scored based on total shots, represents another relevant variable to explore in the PL in order to win more matches.
Finally, the difficulty of obtaining match data from other professional leagues, which would allow further comparisons and verification of whether the patterns presented in this article differ or recur in other scenarios, is a limitation of this study. Including classifiers for the quality of the league's teams in the analysis, with all match data available, and investigating more than one league from different countries in the same study are relevant suggestions for future research, since more patterns would be identified to help teams improve their performance.