Design and development of statistical and based-on-Machine Learning techniques for top-5 European football leagues
Alvear González, Rafael (2023)
Master's Programme in Computing Sciences
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. Only for Your own personal use. Commercial use is prohibited.
Date of approval
2023-07-31
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202307057091
Abstract
Statistical techniques arose centuries ago with the aim of analysing data and making decisions based on observation. However, the growth of computational power and the ability to store large amounts of data have turned statistical learning into machine learning. Over recent decades, the rise of Machine Learning has revolutionized Data-Driven Decision-Making for companies and organizations in every industry and professional field, including sports.
The goal of this research is to find the most competitive football league in Europe. To do this, a statistical analysis will be carried out from three different points of view, together with a predictive analysis based on the design and implementation of Supervised Learning models (algorithms). Throughout the project, a methodology typical of any data science process will be followed, including preliminary steps such as data cleaning and processing.
All the analyses combined, along with the interpretability of the predictive models and the prediction of results for matches between teams from different leagues, will allow us to determine that the Spanish La Liga, followed by the German Bundesliga and the Italian Serie A, is the most difficult league for foreign teams, while the French Ligue 1 is the easiest. In addition, the LeagueXplorer application will be developed to let users make their own predictions.