Activities

Date: Friday November 17 at 4 pm

Injury risk for NBA players: a survival analysis study


Injuries commonly occur in sports, and their prevention is important for many reasons. Indeed, injuries have economic implications for teams and may psychologically impact athletes. The survival analysis framework is used to evaluate NBA players' injury risk through statistical models for recurrent events. To this end, a unique data set was created through a non-trivial harmonization and merging of several data sources (injury data, play-by-play data and a dataset of players’ information). All injuries that occurred from the 2010-2011 season to the 2019-2020 season have been examined.

Two separate analyses have been carried out with two different aims:

Results suggest that the players' role and BMI have an important effect on the risk of injury. Moreover, a new variable measuring the weakness of each player has been evaluated to take into account the number of injuries that occurred within a short period of time.

The talk was given by Ambra Macis from the University of Brescia, Department of Economics and Management.

Date: Friday September 29 at 3 pm CET

Phylourny: efficiently calculating elimination tournament win probabilities via phylogenetic methods

When predicting the outcome of knockout tournaments, such as those typical in football, the traditional method is to fit a model using historical data, and then use that model to simulate the tournament with a Monte Carlo method to obtain predicted win probabilities. However, using lessons and algorithms from computational phylogenetics, we can instead compute the win probabilities for a knockout tournament exactly for some specific model, while also computing the results significantly faster (2-3 orders of magnitude). We implemented these techniques in a tool called Phylourny.


In addition, we use Phylourny to apply further techniques from computational phylogenetics. Specifically, we explore the parameter space for a given model using Markov chain Monte Carlo and summarize the stability of results for those model parameters. We show, with examples from several tournaments, that exploring the parameter space produces more robust predictions and characterizes the confidence of the prediction given.
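As an illustration of what computing knockout win probabilities exactly can look like, here is a minimal Python sketch. It is not Phylourny's code. It assumes a single-elimination bracket with a power-of-two number of teams and a hypothetical pairwise matrix `p`, where `p[i][j]` is the probability that team i beats team j; the function name and the numbers are made up for the example.

```python
def knockout_win_probs(p):
    """Exact title probabilities for a single-elimination bracket.

    p[i][j] is the (assumed known) probability that team i beats team j.
    Teams are listed in bracket order: 0 plays 1, 2 plays 3, and so on.
    The number of teams must be a power of two.
    """
    n = len(p)
    assert n > 0 and n & (n - 1) == 0, "number of teams must be a power of two"
    # reach[i] = probability that team i is still alive before the current round.
    reach = [1.0] * n
    size = 1  # size of each sub-bracket that has already been resolved
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            # Block of 2*size teams containing i; opponents come from its other half.
            block = (i // (2 * size)) * (2 * size)
            if i < block + size:
                opponents = range(block + size, block + 2 * size)
            else:
                opponents = range(block, block + size)
            nxt[i] = reach[i] * sum(reach[j] * p[i][j] for j in opponents)
        reach = nxt
        size *= 2
    return reach


# Hypothetical four-team example in which team 0 is the favourite.
p = [
    [0.0, 0.7, 0.6, 0.8],
    [0.3, 0.0, 0.5, 0.6],
    [0.4, 0.5, 0.0, 0.7],
    [0.2, 0.4, 0.3, 0.0],
]
probs = knockout_win_probs(p)
```

Each round costs a sum over possible opponents rather than a batch of simulated tournaments, which is why no Monte Carlo noise enters the result for a fixed matrix `p`.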

The talk was given by Ben Bettisworth from the Heidelberg Institute for Theoretical Studies.

Date: Friday March 24 at 3 pm CET

Extensions of the Dixon and Coles model

In 1997 Dixon and Coles proposed a model for football by extending the Double Poisson model. The model added some dependence to the known Double Poisson model and also allowed some probability to be transferred to particular scores. The range of correlation provided by the model is limited but sufficient for some championships. Since then the model has found wide applicability. In this talk I will present my work with colleagues that extends the idea of the model in different directions. The key idea is that the Dixon and Coles model can be seen as a special case of a known family of distributions, and hence its extension can be put simply into this framework. I will also present extensions in several directions as well as a few applications of the new extended model. Properties of the extended model will also be discussed to better understand the new model.
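For readers unfamiliar with the original model, the following Python sketch shows the low-score adjustment of Dixon and Coles (1997) applied to two independent Poisson goal counts. It covers only the original model, not the extensions presented in the talk. The goal rates `lam` and `mu` and the dependence parameter `rho` are hypothetical values chosen for illustration.

```python
import math


def poisson_pmf(k, rate):
    """Probability of k goals under a Poisson distribution with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def dixon_coles_tau(x, y, lam, mu, rho):
    """Adjustment factor that moves probability between the scores 0-0, 0-1, 1-0 and 1-1."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def dixon_coles_pmf(x, y, lam, mu, rho):
    """Joint probability of the score x-y (home goals x, away goals y)."""
    return dixon_coles_tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)


# Hypothetical parameters: home rate 1.4, away rate 1.1, mild negative rho.
lam, mu, rho = 1.4, 1.1, -0.1
total = sum(dixon_coles_pmf(x, y, lam, mu, rho) for x in range(25) for y in range(25))
p_nil_nil = dixon_coles_pmf(0, 0, lam, mu, rho)
```

With a negative `rho` the draws 0-0 and 1-1 become more likely than under independence, while the four adjustments cancel so that the probabilities still sum to one.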

The talk was given by Dimitris Karlis from the Athens University of Economics and Business.

Date: Friday February 24th at 4 pm

Do sports problems require tailored methods or direct applications? On a swimming-oriented journey with a Bayesian roadmap

When looking at a new problem for the first time, the initial question that comes to mind can generally be summarised as: has this problem, or a closely related one, already been tackled before? Although a rather recent scientific field of study, sports science often faces the same pattern, even though the number of recognised 'resolved problems' remains relatively low compared to more established disciplines. In particular, when it comes to statistical tools, the literature offers an abundance of methods to handle a variety of problems, such as image classification, missing data reconstruction, time series forecasting, dimensionality reduction, and data visualisation, among others. In this presentation, I will try to illustrate how, in the past 5 years, determining whether my current sport-related problem required developing a novel tailored statistical tool or simply applying an established method to the dataset at hand was generally the most important step towards an adequate answer. From a few articles using direct applications of Bayesian mixed models on morphological swimming datasets to the 3-year-long development of a novel machine learning framework specifically dedicated to forecasting irregular time series from the follow-up of swimmers' performances, let us explore through these examples the variety of challenges, with their intrinsic complexity, coming from sports science applications. In conclusion, I will try to emphasise how those new statistical models, originally tailored to handle sports-related problems, can often find natural applications in other fields like medicine, biology, or robotics (among others), thus contributing to the common methodological toolbox shared by many applied sciences.

The talk will be given by Arthur Leroy from the University of Manchester.

Date: Friday October 21st at 3 pm

Analysis of basketball team performance using Bayesian longitudinal hidden Markov models

In this talk we first give an overview of Bayesian longitudinal models, indicating their main properties, in order to introduce a Bayesian longitudinal hidden Markov model (HMM), which is applied to sports data.

To begin, Bayesian inference accounts for uncertainty in terms of probability distributions and uses Bayes’ theorem to update all relevant information. In addition, Bayesian statistics simplifies the implementation and interpretation of mixed effects models, which are very common in longitudinal modelling. Longitudinal data can be briefly defined as measurements of the same individual taken repeatedly over time. They include observations between and within individuals that allow the assessment of general patterns of the target population as well as specific individual characteristics.

Finally, sports data analytics is a relevant topic in applied statistics and its importance has been increasing in recent years. We propose a Bayesian HMM that analyses the hot hand phenomenon in consecutive basketball shots by the Miami Heat in the 2005-2006 season of the National Basketball Association (NBA). We show that this Bayesian HMM can be a powerful tool to assess ‘streakiness’ because it provides relevant information about the team's performance not only over the course of a match but also over the whole competition season.
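To make the HMM idea concrete, here is a small Python sketch of the forward algorithm for a sequence of made (1) and missed (0) shots. It only evaluates the likelihood for fixed parameters; it is not the Bayesian longitudinal model of the talk, and the two hidden states ("cold" and "hot"), the transition matrix and the hit probabilities are hypothetical.

```python
def hmm_likelihood(shots, start, trans, hit_prob):
    """Likelihood of a 0/1 shot sequence under a discrete HMM (forward algorithm).

    start[s]    : probability of starting in hidden state s
    trans[s][t] : probability of moving from state s to state t
    hit_prob[s] : probability of a made shot while in state s
    """
    n_states = len(start)

    def emit(state, shot):
        return hit_prob[state] if shot == 1 else 1.0 - hit_prob[state]

    # alpha[s] = joint probability of the shots so far and being in state s now.
    alpha = [start[s] * emit(s, shots[0]) for s in range(n_states)]
    for shot in shots[1:]:
        alpha = [
            emit(t, shot) * sum(alpha[s] * trans[s][t] for s in range(n_states))
            for t in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state example: state 0 is "cold", state 1 is "hot".
shots = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
likelihood = hmm_likelihood(shots, start, trans, hit_prob=[0.35, 0.60])
```

If both states share the same hit probability the hidden state carries no information and the likelihood reduces to that of independent shots, which gives a simple sanity check. For long sequences one would work with logarithms or rescale `alpha` to avoid numerical underflow.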

The talk will be given by Carmen Armero and Gabriel Calvo Bayarri, both from the University of Valencia.

Date: Friday September 30th at 3 pm

Modelling and prediction in recurrent time-to-event sports injury data: a penalized Cox regression approach

Sports injuries are complex phenomena that result from the dynamic interaction of multiple risk factors and have serious consequences for athletes' health. Recently, statistical models have received special attention in the study of sports injuries as a way to gain an in-depth understanding of their risk factors and mechanisms. In this talk, we evaluate statistical modelling strategies and methods based on the Cox regression model for high-dimensional data and recurrent injury data. Predictive performance is also studied via simulations. A real case study of injuries among female football players at a Spanish football club is presented.
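The objective behind a penalized Cox regression can be sketched in a few lines of Python. The example below is a simplification under stated assumptions: it treats only one event per athlete (no recurrent injuries), uses the Breslow form of the partial likelihood and a lasso (L1) penalty, and all names and data are hypothetical. It evaluates the loss for a given coefficient vector and does not perform the optimization.

```python
import math


def penalized_cox_loss(beta, times, events, covariates, penalty):
    """Negative log partial likelihood (Breslow form) plus an L1 penalty.

    times[i]      : follow-up time of athlete i
    events[i]     : 1 if an injury was observed at times[i], 0 if censored
    covariates[i] : list of covariate values for athlete i
    penalty       : lasso tuning parameter (>= 0)
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    loss = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            # Risk set: every athlete still under observation at time t_i.
            risk = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            loss -= scores[i] - math.log(risk)
    return loss + penalty * sum(abs(b) for b in beta)


# Hypothetical toy data: three athletes, two covariates, all injured.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
covariates = [[0.5, 1.0], [1.5, 0.0], [0.2, 2.0]]
loss_at_zero = penalized_cox_loss([0.0, 0.0], times, events, covariates, penalty=0.5)
```

The L1 term pushes coefficients of uninformative covariates to exactly zero, which is what makes this kind of penalty attractive when there are many candidate risk factors.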

The talk will be given by Dae-Jin Lee from BCAM - Basque Center for Applied Mathematics.

Date: Friday June 3rd at 11 am (!!)

Why FIFA should re-draw the 2022 World Cup

The draw for the 2022 FIFA World Cup takes place in April 2022 when three winners of the play-offs remain unknown. Seeding is based on the FIFA World Ranking released on 31 March 2022 and these three teams are drawn from the weakest Pot 4. We show that the official seeding policy does not balance the difficulty levels of the groups to the extent possible: a better alternative would be to assign the placeholders according to the highest-ranked potential winner, similar to the rule used in the UEFA Champions League qualification. The questionable decision of FIFA has harmed certain nations. In particular, Ukraine should play against stronger opponents if it manages to qualify for the World Cup after the Russian invasion has forced a rescheduling of its match(es). In the spirit of fairness and solidarity, FIFA is strongly encouraged to repeat the group stage draw following our proposal.

The talk will be given by Laszlo Csato from Corvinus University.


Cycling Analytics: Present and Future Developments

Analytics has only recently entered the professional cycling industry with predictive modeling applied for talent scouting as well as for race outcome prediction. We will discuss these developments as well as some interesting avenues for future research.

The talk will be given by Bram Janssens from Ghent University.

Date: Friday May 20th at 11 am  (!!)

The impact of COVID on athletes

Although we had all hoped that the Corona pandemic would have come to an end by now, unfortunately it continues to plague society. The impact of COVID on sports and certainly on sports practitioners (athletes) remains large. In this training we will therefore discuss the impact of a SARS-CoV-2 infection on the various systems in the body (cardiovascular, muscular, etc) and both the short- and long-term effects will be explained. This training will be based on a literature overview and our own research results in this area will be presented as well.

The talk will be given by Evi Wezenbeek from Ghent University.

Date: Friday April 8th at 4 pm

How to conclude a suspended sports league?

Professional sports leagues may be suspended due to various reasons such as the recent COVID-19 pandemic. A critical question the league must address when re-opening is how to appropriately select a subset of the remaining games to conclude the season in a shortened time frame. Despite the rich literature on scheduling an entire season starting from a blank slate, concluding an existing season is quite different. Our approach attempts to achieve team rankings similar to those that would have resulted had the season been played out in full. We propose a data-driven model which exploits predictive and prescriptive analytics to produce a schedule for the remainder of the season consisting of a subset of originally-scheduled games. Our model introduces novel rankings-based objectives within a stochastic optimization model, whose parameters are first estimated using a predictive model. We present simulation-based numerical experiments from previous National Basketball Association (NBA) seasons 2004-2019, and show that our models are computationally efficient, outperform a greedy benchmark that approximates a non-rankings-based scheduling policy, and produce interpretable results. Our data-driven decision-making framework may be used to produce a shortened season with 25-50% fewer games while still producing an end-of-season ranking similar to that of the full season, had it been played.

The talk will be given by Ali Hassanzadeh from the University of Manchester.

Date: Friday March 25th at 3.30 pm

Exploring entertainment utility from football games

The uncertainty of outcome hypothesis, a crucial aspect of sport economics since the very first seminal contributions, remains contentious. Investigations using pre-match betting odds tend to produce little support for the hypothesis. More recently, following the work of Ely et al. (2015) attention has switched somewhat to identifying the entertainment utility from suspense and surprise. Two recent studies, Bizzozero et al. (2016) and Buraimo et al. (2020), using in-play data on in-match events and television viewership, provide some initial support for uncertainty-of-outcome-related preferences. They are, however, unable to shed any light on fan sentiment, or indeed to identify fan reactions. However, it seems likely that fans of the contestants participating in a contest will react differently to neutral viewers, and hence understanding their particular reaction to particular in-play stimuli like suspense, surprise and shock, is important. In this paper we use in-play betting prices, information on in-play match events, and social media information from fans of English Premier League soccer teams to firstly identify how many of those participating in a football match event are fans of the two teams involved, and subsequently assess how they reacted to the suspense, but also the shock and surprise from their perspective as fans. Preliminary results suggest that surprise appears to be the driving factor in social media activity.

This talk will be given by James Reade from the University of Reading.

Date: Friday January 28th at 4 pm

Quality control tools for probabilistic forecasts of football match results

In football match forecasting, much attention has been devoted to statistical approaches that model the outcomes either as Win, Draw & Loss (WDL) categories, or as score lines. The second step lies in the evaluation of such forecasts. At first, the emphasis is on computing synthetic criteria such as MSE and, more generally, probability scoring rules as introduced and popularized by weather forecasters. We can go further by looking at elementary components of forecast quality such as Reliability, Resolution and Discrimination, both numerically and graphically. This presentation briefly reviews this validation step both theoretically and practically in the light of standard predictions of outcomes for the UEFA Champions League (C1).

Win forecasts of the C1 group stage matches derived from a simple Poisson regression model and from bookmaker odds are well calibrated, with good resolution and discrimination properties for both Home and Away outcomes. In contrast, forecasts of draws obtained with these procedures severely lack refinement, resolution, and discrimination.
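Two of the scoring rules commonly used for WDL forecasts can be written down directly. The Python sketch below shows the multi-category Brier score (a mean-squared-error type criterion) and the ranked probability score for a single match. Whether these exact variants are the ones used in the talk is an assumption, and the forecast values are hypothetical.

```python
def brier_score(probs, outcome):
    """Multi-category Brier score for one match.

    probs   : forecast probabilities, e.g. [home win, draw, away win]
    outcome : index of the category that actually occurred
    """
    return sum((p - (1.0 if k == outcome else 0.0)) ** 2 for k, p in enumerate(probs))


def ranked_probability_score(probs, outcome):
    """Ranked probability score for ordered categories (home win, draw, away win)."""
    cum_forecast = 0.0
    cum_observed = 0.0
    total = 0.0
    # Compare cumulative forecast and cumulative outcome; the last term is always zero.
    for k, p in enumerate(probs[:-1]):
        cum_forecast += p
        cum_observed += 1.0 if k == outcome else 0.0
        total += (cum_forecast - cum_observed) ** 2
    return total / (len(probs) - 1)


# Hypothetical forecast for a match that ended in a home win (index 0).
forecast = [0.5, 0.3, 0.2]
bs = brier_score(forecast, 0)
rps = ranked_probability_score(forecast, 0)
```

Lower values are better for both rules. The ranked probability score respects the ordering of the outcomes, so a forecast that leans towards a draw is penalized less after a home win than one that leans towards an away win.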

This talk will be given by Jean-Louis Foulley from the Université de Montpellier, affiliated with IMAG.

Date: Friday December 10th at 4 pm

Motion Analysis in Sport

Motion is an integral and essential part of life. Motion in sport is complex and requires extreme precision in milliseconds. Exercise and sport increase strength, endurance and flexibility in all ages and genders. Physiological and pathological motion can be evaluated to prevent injury and increase performance. Motion analysis in sports can furthermore define game strategy and tactics, evaluate team organization capacity, give feedback to players and athletes, and assess the efficiency of training. Also known as sports biomechanics, motion analysis in sports is an interdisciplinary science that quantifies joint angles, ground reaction forces and skeletal muscle activity in a calibrated space using sophisticated high-speed cameras, force plates and electromyography (EMG), respectively. After data acquisition, advanced computer software calculates physiological and pathological joint moments and powers. Kinetics, kinematics and skeletal muscle activity by EMG can be quantified in the lab and field. Muscle strength and proprioception are correlated with motion analysis. In this presentation, lab and field experience in various sport biomechanical studies will be shared.

The talk will be given by Prof. Feza Korkusuz from Hacettepe University.

Date: Friday November 19th at 4 pm

Schedule-adjusted league tables during the football season

In this talk I will show how to construct a better football league table than the official ranking based on accumulated points to date.  The aim of this work is (only) to produce a more informative representation of how teams currently stand, based on their match results to date in the current season; it is emphatically not about prediction.  A more informative league table is one that takes proper account of "schedule strength" differences, i.e., differing numbers of matches played by each team (home and away), and differing current standings of the opponents that each team has faced.

This work extends previous "retrodictive" use of Bradley-Terry models and their generalizations, specifically to handle 3 points for a win, and also to incorporate home/away effects coherently without assuming homogeneity across teams.  Playing records that are 100% or 0%, which can be problematic in standard Bradley-Terry approaches, are incorporated in a simple way without the need for a regularizing penalty on the likelihood.  A maximum-entropy argument shows how the method developed here is the mathematically "best" way to account for schedule strength in a football league table.
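As background, the basic Bradley-Terry model that this work builds on can be fitted with a short iteration. The Python sketch below uses the classical MM (Zermelo) update and ignores draws, the 3-points-for-a-win rule and home/away effects, so it is not the method developed in the talk. The win counts are hypothetical.

```python
def fit_bradley_terry(wins, iterations=200):
    """Fit basic Bradley-Terry strengths with the classical MM (Zermelo) iteration.

    wins[i][j] is the number of times team i beat team j.
    Returns strengths s that sum to 1, with P(i beats j) = s[i] / (s[i] + s[j]).
    Assumes every team has at least one win and one loss.
    """
    n = len(wins)
    strengths = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        strengths = [s / norm for s in updated]
    return strengths


# Hypothetical results: each pair of teams met four times.
wins = [
    [0, 3, 2],
    [1, 0, 3],
    [2, 1, 0],
]
strengths = fit_bradley_terry(wins)
```

This plain iteration breaks down when a team has won or lost every match, because its estimated strength drifts to the boundary. That is the 100% or 0% playing-record problem mentioned above, which the talk handles without a regularizing penalty.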

Illustrations will be from the English Premier League.

The talk will be given by David Firth from the University of Warwick.

DOUBLE EVENT! Date: Friday October 15th at 4 pm

Chaos in motion: the application and interpretation of entropy in biosignal analysis

The number of entropy statistics and studies employing them in the analysis of physiological time series has increased substantially over the last decade. In the context of sport science, the application of entropy remains relatively under-exploited despite the potential for entropy measures to quantify the complex nonlinear behaviour present in many real-world systems. Estimation of entropy typically involves several embedding parameters whose selection is critical in order to accurately quantify the underlying dynamics of the system under observation, with newer methods seeking to remove extraneous parameters altogether. However, the appropriate choice of embedding parameters has generally been overlooked, often leading researchers to infer a distorted meaning from entropy values and potentially draw misleading conclusions as a result.

This talk will describe entropy in the context of time series analysis, present an overview of the many entropy methods established in the scientific literature, and discuss the appropriate estimation of such measures for quantifying nonlinear behaviour in sport science applications with reference to a newly developed software toolkit called EntropyHub.
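To show where the embedding parameters enter, here is a minimal Python sketch of sample entropy with embedding dimension `m` and tolerance `r`. It is an independent illustration and does not use the EntropyHub toolkit or its API. The tolerance is taken in absolute units here, whereas in practice it is often set relative to the standard deviation of the signal, and the two test signals are made up.

```python
import math
import random


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy with embedding dimension m and absolute tolerance r."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                # Chebyshev distance between the two templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined for this choice of m and r
    return -math.log(a / b)


# A perfectly regular signal versus Gaussian noise (both hypothetical).
periodic = [0.0, 1.0] * 50
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
entropy_periodic = sample_entropy(periodic)
entropy_noise = sample_entropy(noise)
```

The regular signal yields an entropy of zero, while the noise yields a clearly positive value. Changing `m` or `r` changes both match counts, which is why the choice of these parameters affects the interpretation of the result.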

The talk will be given by Matthew Flood from the Luxembourg Institute of Health.

Application of Posterior predictive distribution estimators under the presence of additional information

In this talk, we present the application of the posterior predictive distribution when there exist some other sources of information available. To illustrate how well the proposed distribution estimators perform, we provide both a simulated study and real examples, by analyzing datasets of the National Hockey League (NHL), as well as soccer games.

The talk will be given by Abdolnasser Sadeghkani from the Instituto Tecnologico Autonomo de Mexico.

Date: Friday September 10th at 4 pm

A Hybrid Machine Learning Approach for the Modeling and Prediction of the UEFA EURO2020

Conventional approaches that analyze and predict the results of international matches in football are mostly based on the framework of Generalized Linear Models. The most frequently used type of regression model in the literature is the Poisson model. It has been shown that the predictive performance of such models can be improved by combining them with different regularization methods such as penalization.

More recently, methods from the machine learning field such as boosting and random forests have also turned out to be very powerful in the prediction of football match outcomes. Here, we analyze both a hybrid random forest extension based on conditional inference trees and a hybrid boosting extension based on extreme gradient boosting for modeling football matches. The models are fitted to match data from previous UEFA European Championships (EUROs) and, based on the corresponding estimates, all match outcomes of the EURO 2020 are repeatedly simulated (100,000 times), resulting in winning probabilities for all participating national teams.
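The simulation step can be illustrated for a single match. The Python sketch below assumes that a fitted model has already produced an expected number of goals for each team and that goals follow independent Poisson distributions; the rates are hypothetical and the hybrid random forest and boosting models themselves are not reproduced.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_match(rate_a, rate_b, n_sims=20000, seed=1):
    """Estimate win/draw/loss probabilities for team A by repeated simulation."""
    rng = random.Random(seed)
    counts = {"win": 0, "draw": 0, "loss": 0}
    for _ in range(n_sims):
        goals_a = sample_poisson(rate_a, rng)
        goals_b = sample_poisson(rate_b, rng)
        if goals_a > goals_b:
            counts["win"] += 1
        elif goals_a == goals_b:
            counts["draw"] += 1
        else:
            counts["loss"] += 1
    return {key: value / n_sims for key, value in counts.items()}


# Hypothetical expected goals: 1.8 for team A and 0.9 for team B.
result = simulate_match(1.8, 0.9)
```

Simulating a whole tournament repeats this for every fixture, resolves drawn knockout matches by some additional rule, and counts how often each team lifts the trophy.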

The talk will be given by Andreas Groll from TU Dortmund University.

Date: Friday June 11th at 4 pm

To predict or protect; the value of screening in sports practice

Recently, an academic discussion has evolved around the message that screening does not predict which athlete will sustain an injury. In various media the clinical implications of this conclusion are interpreted to mean that screening is useless in sports medical practice. However, screening remains essential in our efforts to protect athletes’ health. Indeed, screening may not predict which athlete will sustain an injury, but this does not directly make screening a useless tool in our tool bag. To extend what has been a robust discussion, this keynote will go in-depth into the concept of screening and explain some of the definitional issues that appear to fuel the heated debate. This address offers potential reasons why and how individual screening tests lack clinical utility, as well as what is needed to improve the value and efficiency of our instruments. This all leads up to the argument that screening can be important to protect the health of an individual athlete, given that some ground rules are needed and limitations exist.

The talk will be given by Evert Verhagen from the University of Amsterdam.

Date: Friday May 21st at 4 pm

Introducing Regularisation to Generalised Joint Regression Modelling and its Application to Football and Sports

When modelling the bivariate outcome of football matches and other sports, many different approaches regarding dependency have been investigated. We propose the use of copula regression via the powerful GJRM (Generalised Joint Regression Models) framework in R by Giampiero Marra and Rosalba Radice and present its use for modelling match results. Motivated by the application to football and FIFA World Cups in particular, we introduce two types of useful penalties. The first tackles a very specific issue occurring in sport tournaments and leagues (or other competitive situations), while the second is a Lasso-approximation yielding general sparsity.

The talk will be given by Hendrik van der Wurp from the Technical University of Dortmund.



Date: Friday April 23rd at 3.30 pm

Enduring Love: the Long-Term Effect of a New Stadium on Attendance at Professional English Soccer

Since 1988, 22 of the 92 English professional football clubs have acquired a new stadium. We estimate the causal effect of the new facility on attendance using a difference-in-differences model. We find that a new stadium raises attendance by around 20% on average, and this effect is sustained over nearly two decades. This result contrasts with the "Honeymoon effect" identified in the US literature, where attendance quickly reverts to the mean. We explain this by the promotion and relegation system. Unlike in the closed major leagues, a new stadium represents an opportunity for a club in a lower division to generate increased revenue from fans, hire better players, win promotion and sustain a higher league position on average.
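The logic of this estimator can be shown with the simplest two-group, two-period case. The Python sketch below computes the estimate from four group means; the attendance figures are invented for illustration and are not taken from the study, which uses a regression model on many clubs and seasons.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Classic 2x2 difference-in-differences estimate from four group means."""
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical average attendances (not from the study).
effect = diff_in_diff(
    treated_before=10000.0,  # clubs that later moved, before the new stadium
    treated_after=12500.0,   # the same clubs after the move
    control_before=9000.0,   # clubs that never moved, same periods
    control_after=9500.0,
)
```

The change observed among clubs without a new stadium serves as the counterfactual trend, so only the extra change among clubs that moved is attributed to the stadium.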

The talk will be given by Stefan Szymanski from the University of Michigan. The organization of this webinar is handled by James Reade from the University of Reading.



Date: Friday March 12th at 4 pm (or 3.30 pm)

Machine Learning in Orthopaedic Sports Medicine - Clinical Translation from the Registry to the Clinic

While the clinical application of machine learning to several health-care disciplines has increased considerably over recent years, it remains in its infancy in orthopaedic sports medicine. This presentation will review the basics of machine learning, give examples of how it has impacted other areas of orthopaedic surgery, and illustrate how machine learning can be applied to existing national knee ligament registers. Completed and ongoing collaborative studies between the University of Minnesota and the University of Oslo will be discussed, including a demonstration of how an in-clinic calculator was developed capable of estimating the risk of ACL reconstruction failure at a patient-specific level. Focusing on clinical translation of machine learning techniques, future opportunities within sports medicine will also be presented to stimulate idea generation and ultimately improve patient care. 

R. Kyle Martin MD FRCSC is an orthopaedic sports medicine surgeon with the University of Minnesota. Originally from Canada, Dr. Martin completed his orthopaedic training at the University of Manitoba in 2017. He then travelled to Oslo, Norway, where he spent one year in a clinical fellowship with Professor Lars Engebretsen at the University of Oslo and the Oslo Sports Trauma Research Center. Following the Oslo fellowship, Dr. Martin completed a second clinical fellowship at Mayo Clinic in Rochester, MN. His current clinical practice revolves around knee, hip, and shoulder injuries and arthroscopy, while his research focus is on machine learning and its clinical applications to the field of sports medicine.



Date: Friday February 26th at 3.30 pm

Real-Time Skeleton Detection for Visual Sports Analysis and ... You

Dr. Manuel Stein presents a system for automated data acquisition and analysis from simple video recordings of team sport matches. The proposed system focuses on extracting movement data as well as body poses for players and allows tracking of ball movement in the case of ball-based forms of team sports. Furthermore, Dr. Manuel Stein will provide insights into his novel system for automatically displaying complex and advanced 2.5D visualizations superimposed on the original video recordings. As CEO & Co-Founder of Subsequent, he is especially interested in new project ideas and would like to hear about how you would like to make use of such a system.



Date: Friday January 22nd at 4pm

Shoe cushioning, body mass and running biomechanics as risk factors for running injury: A randomised trial with 800+ recreational runners


The objective of the event is to present a large randomised trial on the relationship between shoe cushioning, running biomechanics and the risk of running-related injury to researchers interested in injury aetiology, injury prediction and data sciences. The second part of the webinar will specifically focus on the challenges and opportunities resulting from the large dataset collected in this project, as well as on recommendations for future research in injury prevention.


a. Effect of shoe cushioning and body mass on injury risk[3]

b. Effect of cushioning on running biomechanics[4]

a. Predicting cumulative load using a wearable device[5] – Anne Backes (LIH)

b. Predicting running-related injury using machine learning – Hans van Eetvelde (UGent)


[1] Theisen D, Nielsen R, Malisoux L. The relationship between running shoes and running injuries: Choosing between a complicated truth and a simple lie. In: Ley C, Dominicy Y, editors. Science meets sports: When statistics are more than numbers. 1 ed: Cambridge Scholars Publishing; 2020. p. 123-146.

[2] Malisoux L, Delattre N, Urhausen A, Theisen D. Shoe cushioning, body mass and running biomechanics as risk factors for running injury: a study protocol for a randomised controlled trial. BMJ Open 2017; 7(8):e017379.

[3] Malisoux L, Delattre N, Urhausen A, Theisen D. Shoe Cushioning Influences the Running Injury Risk According to Body Mass: A Randomized Controlled Trial Involving 848 Recreational Runners. Am J Sports Med 2020; 48(2):473-480.

[4] Malisoux L, Delattre N, Meyer C, Gette P, Urhausen A, Theisen D. Effect of shoe cushioning on landing impact forces and spatiotemporal parameters during running: results from a randomized trial including 800+ recreational runners. Eur J Sport Sci 2020; 10.1080/17461391.2020.1809713:1-9.

[5] Backes A, Skejø SD, Gette P, Nielsen RØ, Sørensen H, Morio C, et al. Predicting cumulative load during running using field based measures. Scandinavian Journal of Medicine & Science in Sports 2020; 10.1111/sms.13796.

Date: Friday December 11th at 4pm

Statistical concept of CUB models to the world of sports

16:00 The class of CUB models: a paradigm for rating data (by Domenico Piccolo)

16:10 CUB models and extensions: from theory to action (by Rosaria Simone)

16:40 Focus on two developments: Nonlinear CUB and Treatment of "don't know" responses (by Marica Manisera)

16:55 Future research on CUB models in Sports: some insights (by Paola Zuccolotto)

If interested in joining the session, please contact Christophe Ley at Christophe.Ley@UGent.be


Date: Friday December 4th at 2.10pm 

Prevention of injuries, are we heading in the right direction?

Physical activity and sports are an integral part of our society. Both have a positive effect on quality of life. However, it should at the same time be noted that, due to the physical demands, the injury rate in sports is high[1]. Besides the consequences for the player, there are repercussions for the team and club. Injuries do not only lead to reduced performance, they cause financial losses as well[2]. In order to minimize these negative consequences, prevention programs to predict injuries were developed[3]. However, injuries show no tendency to decrease[4]. As such, it can be concluded that injury prevention is currently not sufficiently adequate and in need of change[5]. It is known that current approaches do not sufficiently address the complex and dynamic nature of sports injury aetiology, and this necessitates integrating complex system approaches in sports injury prediction and prevention[6]. Accordingly, due to the lack of suitable methodological approaches, the use of artificial intelligence to identify complex patterns of interactions and the implementation of wearables to continuously monitor the athletes have been introduced[6,7]. Despite the promising future for the use of AI and the implementation of wearables, further research, based on longitudinal studies with large datasets and continuous monitoring, is warranted to establish the effectiveness and predictive performance of these statistical techniques and methods in the particular domain of sports injury risk identification.

Speaker: Evi Wezenbeek (Ghent University) 


[1] Pfirrmann D, et al., Analysis of Injury Incidences in Male Professional Adult and Elite Youth Soccer Players.

[2] Hägglund M, et al., Injuries affect team performance negatively in professional football.

[3] Owen A, et al., Effect of an injury prevention program on muscle injuries in elite professional soccer.

[4] Ekstrand J, et al., Hamstring injuries have increased by 4% annually in men’s professional football, since 2001.

[5] Van Dyk N. et al., Prevention forecast: cloudy with a chance of injury.

[6] Bittencourt et al., Complex systems approach for sports injuries.

[7] Claudino et al., Current Approaches to the Use of Artificial Intelligence for Injury Risk Assessment and Performance Prediction in Team Sports.