Ice hockey checking detection from indoor localization data
Parto, Juha-Pekka (2021)
Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2021-03-09
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202101171390
Abstract
The purpose of this thesis was to conduct a preliminary investigation into automatically detecting bodychecks between ice hockey players from indoor localization data. The objectives were to create a bodycheck dataset, train a machine learning algorithm on it, and evaluate the algorithm's performance on full match runs. The location data was obtained from the Wisehockey sport analytics platform. Bodychecks in fourteen professional ice hockey matches were annotated manually using a custom annotation tool. The location data of the players involved in the annotated bodychecks, together with randomly selected gameplay moments, was gathered into a dataset. A random forest classifier was trained on the dataset. Its performance was measured with receiver operating characteristic (ROC) curves and the area under the curve (AUC). These metrics were computed both for cross-validation splits of the dataset and for the full matches that were used to create the dataset.
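As a minimal sketch of the AUC metric mentioned above, the following pure-Python function computes the area under the ROC curve from binary labels and classifier scores, using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the thesis, which does not state how the metric was implemented.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = bodycheck, 0 = other).

    Equals the probability that a randomly chosen positive receives a higher
    score than a randomly chosen negative, with ties counted as one half.
    """
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Sort by score so that list position corresponds to rank (1-based).
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the group of tied scores starting at index i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied items share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        positives_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_group
        i = j

    # Mann-Whitney U statistic for the positives, normalized to [0, 1].
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Toy example (made-up scores, not thesis data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.0 means every positive is scored above every negative, and 0.5 corresponds to random ranking. Because the metric depends only on how the scores are ranked, it does not by itself say how many false alarms a chosen decision threshold would produce.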
The trained classifier performs well according to these metrics. It reaches an average AUC of 0.995 on the validation splits during training and 0.992 on the full match runs. During the full match runs, the classifier produces few false positives relative to the number of negative cases. However, the absolute number of false positives is still many times larger than the number of bodychecks actually annotated in the matches, so the system as it stands does not perform well enough to be used in a production environment. Typical false positives are situations where players are contesting the puck in close contact. The objectives of the thesis have been met and its purpose fulfilled. The number of false positives can be lowered by further developing the methods presented in this thesis, and the performance of the learning system can be improved even without adding new data sources. The attributes extracted from the location data are not ideal: for example, the representation accounts for only two players and ignores all other players on the ice. Another development direction is to supplement the location data with acceleration data, which would provide information about the impact forces present during bodychecks. A further option is to capture video footage of the detected bodychecks and analyze it with computer vision.
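The gap between a low false positive rate and a large absolute number of false positives follows from class imbalance, and a short calculation makes this concrete. The sketch below is illustrative only: the function name `expected_alarms` and all counts and rates are hypothetical values chosen for the example, not figures reported in the thesis.

```python
def expected_alarms(n_pos, n_neg, tpr, fpr):
    """Expected true positives, false positives, and precision.

    n_pos, n_neg: number of positive and negative cases in a match run
    tpr, fpr:     true and false positive rates at some decision threshold
    """
    true_pos = tpr * n_pos
    false_pos = fpr * n_neg
    total_alarms = true_pos + false_pos
    precision = true_pos / total_alarms if total_alarms > 0 else 0.0
    return true_pos, false_pos, precision


# Hypothetical match: 40 bodychecks among 100,000 negative moments,
# with a classifier that finds 90% of bodychecks at a 1% false positive rate.
tp, fp, precision = expected_alarms(n_pos=40, n_neg=100_000, tpr=0.9, fpr=0.01)
print(f"true positives: {tp:.0f}, false positives: {fp:.0f}, "
      f"precision: {precision:.3f}")
```

With these assumed numbers, a 1% false positive rate still yields far more false alarms than true detections, because negatives vastly outnumber bodychecks. This is why a high AUC can coexist with precision that is too low for production use.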