Fitting football models and visualizing predictions with the footBayes package

Leonardo Egidi, Roberto Macrì Demartino, and Vasilis Palaskas

2025-05-15

Modeling football outcomes

Modeling football outcomes became incredibly popular over the last years. However, an encompassing computational tool able to fit in one step many alternative football models is missing yet.

With the footBayes package we want to fill the gap and to give the possibility to fit, interpret and graphically explore the following goal-based Bayesian football models using the underlying Stan (Stan Development Team 2020) environment using the CmdStanR (Gabry et al. 2024) interface:

While these models offer robust frameworks for modeling football match outcomes, their predictive performance can be further improved by incorporating additional historical information about team strengths. To this aim, the Bradley-Terry-Davidson (BTD) model (Davidson 1970) is an interesting method for deriving historical team strengths based on match outcomes. The footBayes package implements a Bayesian version of the Bradley-Terry-Davidson model, as detailed in Macrı̀ Demartino, Egidi, and Torelli (2024). The resulting relative log-strengths can then be used as an additional covariate in the goal-based models.

Precisely, we’ll learn how to:

The package is also available at the following link:

https://github.com/LeoEgidi/footBayes

In my opinion building packages is like an art’s work, likely to be never-ending. To say in art’s terms, I love this quote from Antoine de Saint-Exupéry (partially rephrased):

“A (software) designer knows that he has reached perfection not when there is nothing more to add, but when there is nothing left to take away!”

Bradley-Terry-Davidson model

The Bradley-Terry model (Bradley and Terry 1952) is one of the most popular modelling techniques in a pairwise comparison context for ranking players or teams. The model assumes that each team \(T_k\), with \(k = 1, \ldots, N_T\), is characterized by a latent parameter, \(\alpha_k > 0\), representing its intrinsic strength. The odds of winning a match are then determined by the ratio of these parameters. Specifically, for a match between team \(T_i\) and team \(T_j\), with \(i\neq j = 1, \ldots, N_T\) , the probability that \(T_i\) defeats \(T_j\) in the -th match, with \(n = 1, \ldots, N\), is \[\begin{align} p_{ij}^W = \mathbb{P}(T_{i}\; \text{defeats} \; T_{j}) = \dfrac{\alpha_i}{\alpha_i+\alpha_j}, \label{basic_BT} \end{align}\] where \(\alpha_{i}\) and \(\alpha_{j}\) are the strength parameters of the teams involved in the match. Parameter identifiability is obtained by imposing a constraint such as \(\sum_{k=1}^{N_T} \alpha_k=1\).

The model is commonly reparameterized by the logarithm of the strength parameters \[\begin{align} p_{ij}^W = \dfrac{\exp(\psi_i)}{\exp(\psi_i)+\exp(\psi_j)}, \end{align}\] where \(\psi_i = \log(\alpha_i)\) and \(\psi_j = \log(\alpha_j)\). Consequently, the parameters are identifiable if \(\sum_{k=1}^{N_T} \psi_k=0\).

The standard Bradley-Terry model does not account for draws. To address this limitation, several alternatives have been proposed. Davidson (1970) introduced an additional parameter, \(\zeta\), that balances the probability of ties against the probability of not having ties, thereby computing three distinct probabilities. The log-parametrization of the BTD model is \[\begin{align} p_{ij}^W &= \dfrac{\exp(\psi_i)}{\exp(\psi_i)+\exp(\psi_j)+\exp(\nu+ (\psi_i+\psi_j)/2)},\\ p_{ij}^D &= \dfrac{\exp(\nu+(\psi_i+\psi_j)/2)}{\exp(\psi_i)+\exp(\psi_j)+\exp(\nu+ (\psi_i+\psi_j)/2)},\\ p_{ij}^L &= \dfrac{\exp(\psi_j)}{\exp(\psi_i)+\exp(\psi_j)+\exp(\nu+ (\psi_i+\psi_j)/2)}, \end{align}\] where \(\nu = \log(\zeta)\). Since the three-way process events are mutually exclusive, the following constraint is imposed \(p_{ij}^W + p_{ij}^D + p_{ij}^L = 1.\)

The Bayesian BTD model requires specifying prior distributions for both the team log-strength parameters and the tie parameter. Specifically, the prior placed on \(\nu\) reflects our initial belief about the impact of teams’ strengths on tie outcomes Issa Mattos and Martins Silva Ramos (2022) . Let \(w_{ij}\) represent the binary outcome where team \(T_i\) defeats team \(T_j\), and let \(d_{ij}\) indicate the binary outcome of a draw between teams \(T_i\) and \(T_j\). Then, the hierarchical Bayesian BTD model is \[\begin{align} w_{ij} \mid p_{ij}^W &\sim \mathrm{Bernoulli}(p_{ij}^W),\\ d_{ij} \mid p_{ij}^D &\sim \mathrm{Bernoulli}(p_{ij}^D),\\ \psi &\sim \mathrm{N}(\mu_\psi,\sigma^2_\psi),\\ \nu &\sim \mathrm{N}(\mu_\nu,\sigma^2_\nu), \end{align}\] where \(\mu_\psi\) and \(\mu_\nu\) are the mean for the team log-strength and log-tie parameters, and \(\sigma^2_\psi\) and \(\sigma^2_\nu\) denote the corresponding variances. Specifically, identical independent Gaussian distributions for each log-strength parameter and for the log-tie parameter, satisfies all the four conditions specifically suited for ranking system proposed by Whelan (2017) .

Bivariate-Poisson model

One main concern with the double Poisson model relies on the fact that the goals scored during a match by two competing teams are conditionally independent. However, in team sports, such as football, water-polo, handball, hockey, and basketball it is reasonable to assume that the two outcome variables are correlated since the two teams interact during the game. Consider, for instance, the realistic football case of the home team leading with 1-0, when only ten minutes are left to play. The away team can then become more determined and can take more risk in an effort to score and achieve the draw within the end of the match. Or, even when one of the two teams is leading say with 3-0, or 4-0, it is likely it will be relaxing a bit, and the opposing team could score at least one goal quite easily. To this aim, goals’ correlation due to a change in the behaviour of the team or both teams could be captured by a dependence parameter, accounting for positive correlation. Positive parametric goals’ dependence is made possible by using a bivariate Poisson distribution.

Consider random variables \(X_r, r =1,2,3\), which follow independent Poisson distributions with parameters \(\lambda_r >0\). Then the random variables \(X=X_1 + X_3\) and \(Y =X_2 + X_3\) jointly follow a bivariate Poisson distribution \(\text{BP}(\lambda_1, \lambda_2, \lambda_3)\), with joint probability function

\[\begin{align} \begin{split} P_{X, Y}(x,y)&= \textrm{Pr}(X=x, Y=y)\\ &= \exp \{ -(\lambda_1+\lambda_2+\lambda_3) \}\frac{\lambda_1^x}{x!}\frac{\lambda_2^y}{y!}\times \\ & \sum_{k=0}^{\min{(x,y)}}\binom{x}{k}\binom{y}{k}k!\left( \frac{\lambda_3}{\lambda_1 \lambda_2} \right)^k. \end{split} \label{eq:biv_density} \end{align}\]

Marginally each random variable follows a Poisson distribution with \(\textrm{E}(X)=\lambda_{1}+\lambda_{3}, \ \textrm{E}(Y)=\lambda_{2}+\lambda_{3}\), and \(\textrm{cov}(X,Y)=\lambda_{3}\); \(\lambda_{3}\) acts as a measure of dependence between the goals scored by the two competing teams. If \(\lambda_3 = 0\) then the two variables are conditionally independent and the bivariate Poisson distribution reduces to the product of two independent Poisson distributions, the double Poisson case.

Let \((x_{n}, y_{n})\) denote the observed number of goals scored by the home and the away team in the \(n\)-th game, respectively. A general bivariate Poisson model allowing for goals’ correlation, as in Karlis and Ntzoufras (2003) is the following:

\[\begin{align} X_n, Y_n| \lambda_{1n}, \lambda_{2n}, \lambda_{3n} & \sim \mathsf{BivPoisson}(\lambda_{1n}, \lambda_{2n}, \lambda_{3n})\\ \log(\lambda_{1n}) & = \mu+\text{home}+ \text{att}_{h_n}+\text{def}_{a_n} + \frac{\gamma}{2}\omega_n\\ \log(\lambda_{2n}) & = \mu +\text{att}_{a_n}+\text{def}_{h_n} - \frac{\gamma}{2}\omega_n \\ \log(\lambda_{3n}) &=\beta_0 + \phi_1 \beta^{\text{home}}_{h_n}+\phi_2\beta^{\text{away}}_{a_n} + \phi_3\boldsymbol{\beta} w_n, \end{align}\]

where \(\lambda_{1n}, \lambda_{2n}\) represent the scoring rates for the home and the away team, respectively; \(\mu\) represents the constant intercept; \(\text{home}\) represents the home-effect, i.e. the well-known advantage of the team hosting the game; \(\text{att}_t\) and \(\text{def}_t\) represent the attack and the defence abilities, respectively, for each team \(t\), \(t=1,\ldots,T\); the nested indexes \(h_{n}, a_{n}=1,\ldots,T\) denote the home and the away team playing in the \(n\)-th game, respectively; \(\beta_0\) is a constant parameter; \(\beta^{\text{home}}_{h_n}\) and \(\beta^{\text{away}}_{a_n}\) are parameters that depend on the home and away team respectively, \(w_{n}\) is a vector of covariates for the \(n\)-th match used to model the covariance term and \(\boldsymbol{\beta}\) is the corresponding vector of regression coefficients. Furthermore, \(\omega_n = (rank\_points_{h_n}-rank\_points_{a_n})\) captures the difference in relative strengths (ranking points) between the home and away teams in the \(n\)-th match. Finally, the parameter \(\gamma\) tries to correct for the ranking points difference occurring between two competing teams. The parameters \(\phi_1, \phi_2\) and \(\phi_3\) are dummy binary indicators taking values 0 or 1 which may activate distinct sources of the linear predictor. Hence when \(\phi_1=\phi_2=\phi_3=0\) we consider constant covariance as in Egidi and Torelli (2020) , whereas when \((\phi_1, \phi_2, \phi_3)=(1,1,0)\) we assume that the covariance depends on the teams’ parameters only but not on further match covariates, and so on.

The case \(\lambda_{3n}=0\) (the scores’ correlation parameter equals zero) reduces to the double Poisson model, as in Baio and Blangiardo (2010) .

To achieve model’s identifiability, attack/defence parameters are imposed a sum-to-zero constraint:

\[\begin{equation} \sum_{t=1}^{T} \text{att}_{t}=0, \ \sum_{t=1}^{T}\text{def}_{t}=0. \label{eq:sum_to_zero} \end{equation}\]

Another identifiability constraint, largely proposed in the football literature, is the corner-constraint, which assumes the abilities for the \(T\)-th team are equal to the negative sum of the others, and then achieves a sum-to-zero as well:

\[\begin{equation} \text{att}_T= -\sum_{t=1}^{T-1} \text{att}_{t}, \ \text{def}_T=- \sum_{t=1}^{T-1}\text{def}_{t}. \label{eq:corner2} \end{equation}\]

The current version of the package allows for the fit of diagonal-inflated Bivariate Poisson Karlis and Ntzoufras (2003) and zero-inflated Skellam models to better capture the probability of draw occurrences. A draw between two teams is represented by the outcomes on the diagonal of the probability table. To correct for the excess of draws we may add an inflation component on the diagonal of the probability function. This model is an extension of the simple zero-inflated model that allows only for an excess in (0,0) draws. Let’s focus on the diagonal-inflated bivariate Poisson model, specified as:

\[\begin{equation} P_{X, Y}(x,y) = \textrm{Pr}(X=x, Y=y) = \begin{cases} (1-p) \text{BP}(\lambda_1, \lambda_2, \lambda_3) \ \ \ & \text{if} \ x \ne y \\ (1-p) \text{BP}(\lambda_1, \lambda_2, \lambda_3) + pD(x, \eta) \ \ \ & \text{if} \ x = y, \end{cases} \end{equation}\]

where \(D(x, \eta)\) is a discrete distribution with parameter vector \(\eta\).

Now it’s time to fit and interpret the models, and we’ll mainly focus on the bivariate Poisson case. Classical estimates for BP models are provided, among the others, by Karlis and Ntzoufras (2003) (MLE through an EM algorithm) and Koopman and Lit (2015); in the following, we quickly revise the maximum likelihood approach, but we’ll deeply focus on Bayesian estimation with the underlying CmdStan software to better capture:

Likelihood approach

Given the parameter-vector \(\boldsymbol{\theta}= (\{\text{att}_t,\text{def}_t,t=1,\ldots,T \}, \mu, \text{home}, \beta^{\text{home}}_{h_n}, \beta^{\text{away}}_{a_n}, \beta_0, \boldsymbol{\beta})\), the likelihood function of the bivariate Poisson model above takes the following form:

\[\begin{eqnarray} L(\boldsymbol{\theta}) = & \prod_{n=1}^N \exp \{ -(\lambda_{1n}+\lambda_{2n}+\lambda_{3n}) \}\frac{\lambda_{1n}^{x_n}}{{x_n}!}\frac{\lambda_{2n}^{y_n}}{{y_n}!}\times \\ &\sum_{k=0}^{\min{(x_n,y_n)}}\binom{x_n}{k}\binom{y_n}{k}k!\left( \frac{\lambda_{3n}}{\lambda_{1n} \lambda_{2n}} \right)^k. \label{eq:lik_biv} \end{eqnarray}\]

Maximum-likelihood parameters estimation can be performed by searching the MLE \(\hat{\boldsymbol{\theta}}\) such that:

\[\hat{\boldsymbol{\theta}} = \underset{\theta \in \Theta}{\text{argmax}}\ L(\boldsymbol{\theta}),\]

by imposing the following system of partial (log)-likelihood equations:

\[l'(\boldsymbol{\theta})=0.\] Wald and deviance-confidence intervals may be constructed for the MLE \(\hat{\boldsymbol{\theta}}\). A 95% Wald-type interval satisfies:

\[\hat{\boldsymbol{\theta}} \pm 1.96 \ \text{se}(\hat{\boldsymbol{\theta}}).\] As we’ll see, the footBayes package allows the MLE computational approach (along with Wald-type and profile-likelihood confidence intervals) for static models only, i.e. when the model complexity is considered acceptable. As the parameters’ space grows—as it commonly happens when adding dynamic patterns—MLE becomes computationally expensive and less reliable.

Bayesian approach

The goal of the Bayesian analysis is to carry out inferential conclusions from the joint posterior distribution \(\pi(\boldsymbol{\theta}|\mathcal{D})\), where \(\mathcal{D}= (x_n, y_n)_{n=1,\ldots,N}\) denotes the set of observed data for the \(N\) matches. The joint posterior satisfies

\[\pi(\boldsymbol{\theta}|\mathcal{D}) = \frac{p(\mathcal{D}|\boldsymbol{\theta})\pi(\boldsymbol{\theta})}{p(\mathcal{D})} \propto p(\mathcal{D}|\boldsymbol{\theta})\pi(\boldsymbol{\theta}),\]

where \(p(\mathcal{D}|\boldsymbol{\theta})\) is the model sampling distribution (proportional to the likelihood function), \(\pi(\boldsymbol{\theta})\) is the joint prior distribution for \(\boldsymbol{\theta}\), and \(p(\mathcal{D})= \int_{\Theta} p(\mathcal{D}|\boldsymbol{\theta})\pi(\boldsymbol{\theta}) d\theta\) is the marginal likelihood that does not depend on \(\theta\).

In terms of inferential conclusions, we are usually interested in summaries from the marginal posterior distributions of the single parameters: posterior means, medians, credibility intervals, etc.. We can write out the formula for the posterior distribution of the bivariate Poisson model above as:

\[\pi(\boldsymbol{\theta}|\mathcal{D}) \propto \pi(\boldsymbol{\theta}) \prod_{n=1}^N \mathsf{BivPoisson}(\lambda_{1n}, \lambda_{2n}, \lambda_{3n}),\]

where \(\pi(\boldsymbol{\theta})= \pi(\text{att}) \pi(\text{def})\pi(\mu)\pi(\text{home}) \pi(\beta^{\text{home}}_{h_n})\pi( \beta^{\text{away}}_{a_n})\pi(\beta_0)\pi(\boldsymbol{\beta}) \pi(\phi)\) is the joint ptior distribution under the assumption of a-priori independent parameters’ components.

The standard approach is to assign some weakly-informative prior distributions to the team-specific abilities. These parameters are considered exchangeable from two common (prior) distributions:

\[\begin{align} \text{att}_t &\sim \mathrm{N}(\mu_{\text{att}}, \sigma_{\text{att}})\\ \text{def}_t &\sim \mathrm{N}(\mu_{\text{def}}, \sigma_{\text{def}}), \ \ t= \ 1,\ldots,T, \end{align}\]

with hyperparameters \(\mu_{\text{att}}, \sigma_{\text{att}}, \mu_{\text{def}}, \sigma_{\text{def}}\). The model formulation is completed by assigning some weakly-informative priors to the remaining parameters. In what follows, some priors’ options will be handled directly by the user.

In the majority of the cases, \(\pi(\boldsymbol{\theta}|\mathcal{D})\) does not have a closed form and for such reason we need to approximate it by simulation. The most popular class of algorithms designed to achieve this task is named Markov Chain Monte Carlo Methods (see Robert and Casella (2013) for a deep theoretical overview). These methods allow us to generate weakly correlated samples from Markov chains whose stationary distributions coincide with the target posterior.

The footBayes package provides a suite of advanced algorithms for approximating posterior distributions. Each of these methods offers a different balance of computational efficiency and approximation accuracy, providing users with flexible tools for posterior analysis. Its features include:

Hamiltonian Monte Carlo (HMC)

The package leverages HMC via the Stan software for efficient MCMC sampling. Named after Hamiltonian dynamics in physics, HMC mitigates the random-walk behavior and inefficiencies often encountered with Gibbs sampling or the Metropolis-Hastings algorithm. For further details see Betancourt (2017) and the CmdStan users’s guide page.

Variational Inference Methods

footBayes supports two variational inference techniques:

  • Automatic Differentiation Variational Inference (ADVI):
    ADVI approximates the Evidence Lower Bound (ELBO) using Monte Carlo integration and optimizes it in the real-coordinate space via stochastic gradient ascent. For further details see Kucukelbir et al. (2017) and the CmdStan users’s guide page.

  • Pathfinder Algorithm:
    Starting from a random initialization, Pathfinder follows a quasi-Newton optimization path. It estimates the local covariance using negative inverse Hessian estimates from the LBFGS optimizer and returns draws from the Gaussian approximation that minimizes the Kullback-Leibler (KL) divergence to the true posterior. For further details see Zhang et al. (2022) and the CmdStan users’s guide page.

Laplace Approximation

This method operates in an unconstrained parameter space. For parameters that are originally constrained, the approximation is computed at the mode in the unconstrained space, and the resulting normal approximation is transformed back into the constrained space before output. For further details see its CmdStan users’s guide page.

Installing the package

Using footBayes package requires installing the R package cmdstanr (not available on CRAN) and the command-line interface to Stan: CmdStan. You may follow the instructions in Getting started with CmdStanR to install both.

You can install the released version of footBayes from CRAN with:

install.packages("footBayes", type = "source")

Please note that it is important to set type = "source". Otherwise, the ‘CmdStan’ models in the package may not be compiled during installation.

Alternatively to CRAN, you can install the development version from GitHub with:

library(devtools)
install_github("LeoEgidi/footBayes")

and load the following required packages (please, install them on your laptops):

library(footBayes)
library(loo)
library(ggplot2)
library(ggridges)
library(dplyr)
library(bayesplot)

Bradley-Terry-Davidson model fit

The btd_foot function fits a Bayesian Bradley–Terry–Davidson (BTD) model by approximating the posterior distribution using one of the methods described above (HMC, ADVI, Pathfinder, or Laplace approximation), and supports both static and dynamic ranking models to estimate team strengths over time.

Using btd_foot

In this section, we will walk through the steps to use the btd_foot function effectively, including preparing the data, choosing model options, and interpreting the results. Specifically the data must be in a specific format required by the btd_foot function. It should be a data frame with the following columns:

To this aim, we will use data from the Italian football league for the seasons 2020 and 2021.

library(dplyr)
library(footBayes)

data("italy")

italy_2020_2021 <- italy %>%
  dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
  dplyr::filter(Season == "2020" | Season == "2021") %>%
  dplyr::mutate(match_outcome = dplyr::case_when(
    hgoal > vgoal ~ 1, # Home team wins
    hgoal == vgoal ~ 2, # Draw
    hgoal < vgoal ~ 3 # Away team wins
  )) %>%
  dplyr::mutate(periods = dplyr::case_when(
    dplyr::row_number() <= 190 ~ 1,
    dplyr::row_number() <= 380 ~ 2,
    dplyr::row_number() <= 570 ~ 3,
    TRUE ~ 4
  )) %>% # Assign periods based on match number
  dplyr::select(periods,
    home_team = home,
    away_team = visitor, match_outcome
  )

One of the most important features implemented in the btd_foot function is the possibility to choose between assuming that team strengths are constant over time or allowing team strengths to change across different time periods. Specifically, following the approach proposed by Owen (2011) , Egidi, Pauli, and Torelli (2018), and Macrı̀ Demartino, Egidi, and Torelli (2024) for the attack and defense abilities in goal-based models, we can assume that the log-strength parameters vary across a given number of weeks or seasons \(1, \ldots, \mathcal{T}\). To capture these dynamics, we assign the priors for the log-strength parameters to be auto-regressive of order 1 for each time period \(\tau=2,\ldots, \mathcal{T}\):

\[\begin{align} \psi_{t,\tau} \sim \mathrm{N}(\mu_{\psi_{t,\tau-1}}, \sigma_\psi) \end{align}\]

whereas for \(\tau = 1\): \[\begin{align} \psi_{t,1} &\sim \mathrm{N}(\mu_\psi,\sigma_\psi), \end{align}\]

with \(\mu_\psi\) being the initial mean of the log-strength parameter and \(\sigma_\psi\) the standard deviation for the log-strength parameter, assumed to be constant across time and teams. Accordingly, the user can use the dynamic_rank argument:

The Bayesian BTD model provides posterior distributions for the log-tie and log-strength parameters, reflecting the inherent uncertainty in the ranking system. To develop the final ranking of teams, it is essential to summarize these posterior distributions. The rank_measure argument determines how the team strengths are summarized:

# Dynamic Ranking Example with Median Rank Measure
fit_result_dyn <- btd_foot(
  data = italy_2020_2021,
  dynamic_rank = TRUE,
  rank_measure = "median",
  iter_sampling = 1000,
  # parallel_chains = 2,
  chains = 2,
  adapt_delta = 0.9,
  max_treedepth = 12
)

The btd_foot output is an object of class btdFoot that includes a custom print function providing detailed posterior summaries of the model parameters. This function allows users to specify the parameters and teams to display via the pars and teams arguments, respectively, set the number of digits through the digits argument, and determine whether to show rankings, parameters, or both using the display argument.

print(fit_result_dyn,
  display = "parameters",
  pars = c("logStrength", "logTie"),
  teams = c("AC Milan", "AS Roma")
)
#> Bayesian Bradley-Terry-Davidson model
#> ------------------------------------------------
#> Rank measure used: median 
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 9 × 10
#>   variable                   mean median    sd   mad     q5   q95  rhat ess_bulk ess_tail
#>   <chr>                     <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 logStrength[1, AC Milan]  2.16   2.15  0.882 0.874  0.783 3.66   1.01     280.     852.
#> 2 logStrength[2, AC Milan]  1.40   1.40  0.993 1.03  -0.212 3.05   1.01     174.     435.
#> 3 logStrength[3, AC Milan]  2.43   2.44  1.16  1.18   0.437 4.33   1.03     134.     309.
#> 4 logStrength[4, AC Milan]  2.91   2.86  1.35  1.40   0.766 5.14   1.03     131.     310.
#> 5 logStrength[1, AS Roma]   1.48   1.49  0.839 0.831  0.068 2.84   1.00     261.     862.
#> 6 logStrength[2, AS Roma]   0.018  0.011 0.972 0.956 -1.56  1.63   1.01     161.     364.
#> 7 logStrength[3, AS Roma]   0.939  0.956 1.11  1.13  -0.939 2.74   1.03     123.     281.
#> 8 logStrength[4, AS Roma]   1.1    1.04  1.29  1.33  -0.936 3.25   1.03     122.     318.
#> 9 logTie                   -0.008 -0.009 0.087 0.087 -0.148 0.133  1.00    1230.    1339.
# Static Ranking Example with MAP Rank Measure
fit_result_stat <- btd_foot(
  data = italy_2020_2021,
  dynamic_rank = FALSE,
  rank_measure = "map",
  iter_sampling = 1000,
  # parallel_chains = 2,
  chains = 2
)
print(fit_result_stat,
  pars = c("logStrength", "logTie"),
  teams = c("AC Milan", "AS Roma")
)
#> Bayesian Bradley-Terry-Davidson model
#> ------------------------------------------------
#> Rank measure used: map 
#> 
#> Top teams based on relative log-strengths:
#>    periods            team log_strengths
#> 11       1           Inter        2.2973
#> 7        1        AC Milan        2.0472
#> 13       1      SSC Napoli        1.7988
#> 6        1        Juventus        1.5070
#> 20       1        Atalanta        1.2301
#> 19       1      Lazio Roma        1.0167
#> 15       1         AS Roma        0.9186
#> 5        1 Sassuolo Calcio        0.5942
#> 1        1  ACF Fiorentina        0.1174
#> 2        1   Hellas Verona        0.0565
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 3 × 10
#>   variable                mean median    sd   mad     q5   q95  rhat ess_bulk ess_tail
#>   <chr>                  <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 logStrength[AC Milan]  1.92   1.95  0.729 0.735  0.714 3.12   1.02    109.      263.
#> 2 logStrength[AS Roma]   0.735  0.765 0.706 0.699 -0.442 1.91   1.02     95.6     221.
#> 3 logTie                -0.117 -0.116 0.087 0.087 -0.26  0.023  1.00   1229.     1265.

Another significant feature implemented in the btd_foot function is the ability to account for the home-field effect. This effect represents the psychological or logistical benefits that the home team may have over the visiting team. This is modeled as a multiplicative order effect Davidson and Beaver (1977) in the Bayesian BTD framework. This multiplicative parameter becomes in the log-scale an additive term as show \[\begin{align} p_{ij}^W &= \dfrac{\exp(\psi_i + \text{home})}{\exp(\psi_i + \text{home})+\exp(\psi_j)+\exp(\nu+ (\psi_i+ \text{home} + \psi_j)/2)},\\ p_{ij}^D &= \dfrac{\exp(\nu+(\psi_i + \text{home} + \psi_j)/2)}{\exp(\psi_i+ \text{home})+\exp(\psi_j)+\exp(\nu+ (\psi_i + \text{home} + \psi_j)/2)},\\ p_{ij}^L &= \dfrac{\exp(\psi_j)}{\exp(\psi_i + \text{home})+\exp(\psi_j)+\exp(\nu+ (\psi_i + \text{home} + \psi_j)/2)}, \end{align}\] where \(\text{home}\) is the home effect parameter. The argument home_effect permits to consider this extension of the Bayesian BTD where an additional parameter representing the home effect is considered:

The prior_par argument allows the user to specify a list of prior distributions for the model parameters. The priors must be normal distributions. The default priors are the following one \[\begin{align} \psi &\sim \mathrm{N}(0,3),\\ \nu &\sim \mathrm{N}(0,0.3), \\ \text{home} & \sim \mathrm{N}(0,5) \end{align}\]

Below is an example of how to incorporate the home-field effect into the btd_foot function by modifying the priors and enabling the home_effect parameter.

# Dynamic Ranking Example with Median Rank Measure
fit_result_dyn_2 <- btd_foot(
  data = italy_2020_2021,
  home_effect = TRUE,
  dynamic_rank = TRUE,
  prior_par = list(
    logStrength = normal(2, 10),
    logTie = normal(-1.5, 5),
    home = normal(0, 5)
  ),
  rank_measure = "median",
  iter_sampling = 1000,
  # parallel_chains = 2,
  chains = 2,
  adapt_delta = 0.9,
  max_treedepth = 12
)
print(fit_result_dyn_2,
  display = "parameters",
  pars = c("logTie", "home")
)
#> Bayesian Bradley-Terry-Davidson model
#> ------------------------------------------------
#> Rank measure used: median 
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 2 × 10
#>   variable  mean median    sd   mad     q5   q95  rhat ess_bulk ess_tail
#>   <chr>    <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 logTie   0.025  0.028 0.093 0.092 -0.129 0.177  1.00    2076.    1335.
#> 2 home     0.229  0.228 0.111 0.114  0.043 0.411  1       2434.    1115.
# Static Ranking Example with MAP Rank Measure
fit_result_stat_2 <- btd_foot(
  data = italy_2020_2021,
  home_effect = TRUE,
  dynamic_rank = FALSE,
  prior_par = list(
    logStrength = normal(2, 10),
    logTie = normal(0, 2.5),
    home = normal(5, 3)
  ),
  rank_measure = "map",
  iter_sampling = 1000,
  # parallel_chains = 2,
  chains = 2
)
print(fit_result_stat_2,
  pars = c("logTie", "home")
)
#> Bayesian Bradley-Terry-Davidson model
#> ------------------------------------------------
#> Rank measure used: map 
#> 
#> Top teams based on relative log-strengths:
#>    periods            team log_strengths
#> 11       1           Inter          4.14
#> 7        1        AC Milan          3.67
#> 13       1      SSC Napoli          3.17
#> 6        1        Juventus          3.13
#> 20       1        Atalanta          2.56
#> 15       1         AS Roma          2.42
#> 19       1      Lazio Roma          2.27
#> 5        1 Sassuolo Calcio          1.83
#> 1        1  ACF Fiorentina          1.64
#> 8        1       Torino FC          1.42
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 2 × 10
#>   variable   mean median    sd   mad     q5   q95  rhat ess_bulk ess_tail
#>   <chr>     <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 logTie   -0.114 -0.114 0.09  0.09  -0.256 0.039  1.00     405.     458.
#> 2 home      0.205  0.203 0.102 0.102  0.034 0.37   1.00     335.     366.

The btd_foot function offers flexibility in how the posterior distributions are obtained by allowing users to choose the estimation method via the method argument. This argument enables the user to balance computational efficiency with the precision of uncertainty estimates. The available options include:

Below is an example that demonstrates how to use Variational Inference as the estimation method:

# Variational Inference Example
fit_result_vi <- btd_foot(
  data = italy_2020_2021,
  dynamic_rank = TRUE,
  rank_measure = "mean",
  method = "VI"
)

Visualization tools

After fitting the model, the btd_foot function outputs an object of class btdFoot. This object can be used by the plot_btdPosterior function, which is designed to visualize the posterior distributions for the log-strenghts, log-tie and home effect parameters, providing insights into the uncertainty and variability of estimates. Depending on the specifications, it can generate either boxplots or density plots of the posterior distributions. In the case of a dynamic Bayesian BTD model, the function produces a sequence of boxplots for each time period and for each team specified in the teams argument, or for all analyzed teams if this argument is omitted.

By default, plot_btdPosterior generates boxplots of the posterior distributions of teams’ log-strengths. This is particularly useful for comparing the central tendency and variability of team strengths across different periods or between teams.

# Dynamic Ranking

plot_btdPosterior(fit_result_dyn)
plot of chunk plot_btdPosterior_dyn

plot of chunk plot_btdPosterior_dyn

# Static Ranking

plot_btdPosterior(fit_result_stat)
plot of chunk plot_btdPosterior_stat

plot of chunk plot_btdPosterior_stat

It is also possible to select specific teams to plot by using the teams argument

# Dynamic Ranking

plot_btdPosterior(fit_result_dyn,
  teams = c("AC Milan", "AS Roma", "Juventus", "Inter"),
  ncol = 2
)
plot of chunk plot_btdPosterior_teams_dyn

plot of chunk plot_btdPosterior_teams_dyn

# Static Ranking

plot_btdPosterior(fit_result_stat,
  teams = c("AC Milan", "AS Roma", "Juventus", "Inter"),
  ncol = 2
)
plot of chunk plot_btdPosterior_teams_stat

plot of chunk plot_btdPosterior_teams_stat

Additionally, it is possible to plot the posterior density by setting plot_type = "density"

# Dynamic Ranking

plot_btdPosterior(fit_result_dyn,
  teams = c("AC Milan", "AS Roma", "Juventus", "Inter"),
  plot_type = "density",
  scales = "free_y"
)
plot of chunk plot_btdPosterior_teams_dyn_dens

plot of chunk plot_btdPosterior_teams_dyn_dens

# Static Ranking

plot_btdPosterior(fit_result_stat,
  teams = c("AC Milan", "AS Roma", "Juventus", "Inter"),
  plot_type = "density",
  scales = "free_y"
)
plot of chunk plot_btdPosterior_teams_stat_dens

plot of chunk plot_btdPosterior_teams_stat_dens

Furthermore, the plot_logStrength function allows for plotting team rankings based on different summary measures of the posterior log-strengths, including the median, mean, and maximum a posteriori (MAP) estimates used in the btd_foot function. Specifically, this function generates a ggplot object that represents the log-strength values of teams, facilitating the comparison of team performance across different periods or among various teams. Similar to the plot_btdPosterior function, specific teams can be selected using the teams argument.

# Dynamic Ranking

plot_logStrength(fit_result_dyn,
  teams = c("AC Milan", "AS Roma", "Juventus", "Inter")
)
plot of chunk plot_logStrength_teams_dyn

plot of chunk plot_logStrength_teams_dyn

Goal-based models fit

Static fit

To start with some analysis, let’s now italy data about the Italian Serie A, specifically season 2000/2001: the season consists of \(T=18\) teams, we start fitting a static bivariate Poisson model using:

At this stage, we are currently ignoring any time-dependence in our parameters, considering them to be static across distinct match-times.

### Use Italian Serie A 2000/2001

## with 'dplyr' environment
#
# library(dplyr)
# italy <- as_tibble(italy)
# italy_2000<- italy %>%
#  dplyr::select(Season, home, visitor, hgoal,vgoal) #%>%
#  dplyr::filter(Season=="2000")
# italy_2000

## alternatively, you can use the basic 'subsetting' code,
## not using the 'dplyr' environment:
data("italy")
italy <- as.data.frame(italy)
italy_2000 <- subset(
  italy[, c(2, 3, 4, 6, 7)],
  Season == "2000"
)

colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals")


### Fit Stan models
## no dynamics, no predictions
## 4 Markov chains, 'n_iter' iterations each

n_iter <- 200 # number of MCMC iterations after the burn-in
fit1_stan <- stan_foot(
  data = italy_2000,
  model = "biv_pois",
  chains = 4,
  # parallel_chains = 4,
  iter_sampling = n_iter
) # biv poisson

Similarly to the btd_foot function, the custom print function for an object of class stanFoot provides the usual Bayesian model summaries , including posterior means, medians, standard deviations, percentiles at 2.5%, 25%, 75%, 97.5% level, effective sample size (n_eff) and Gelman-Rubin statistic (Rhat). It accepts the same arguments as described previously.

## Print of model summary for parameters:

print(fit1_stan,
  pars = c(
    "home", "rho", "sigma_att",
    "sigma_def", "att", "def"
  ),
  teams = c("AC Milan", "AS Roma")
)
#> Summary of Stan football model
#> ------------------------------
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 8 × 10
#>   variable        mean median    sd   mad     q5    q95  rhat ess_bulk ess_tail
#>   <chr>          <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>    <dbl>    <dbl>
#> 1 home           0.283  0.287 0.067 0.064  0.164  0.386  1.00     809.    579. 
#> 2 rho           -1.75  -1.71  0.314 0.301 -2.34  -1.32   1.01     683.    380. 
#> 3 sigma_att      0.207  0.201 0.065 0.061  0.108  0.318  1.02     240.    296. 
#> 4 sigma_def      0.223  0.218 0.072 0.067  0.116  0.348  1.03     167.     72.9
#> 5 att[AS Roma]   0.262  0.266 0.134 0.141  0.03   0.478  1.01     541.    558. 
#> 6 att[AC Milan]  0.131  0.129 0.116 0.114 -0.042  0.327  1.01     844.    422. 
#> 7 def[AS Roma]  -0.2   -0.187 0.147 0.14  -0.454  0.018  1.01     693.    303. 
#> 8 def[AC Milan]  0      0.004 0.139 0.137 -0.239  0.227  1.00    1486.    633.

The Gelman-Rubin statistic \(\hat{R}\) (Rhat) is below the threshold 1.1 for all the parameters, whereas the effective sample size (n_eff), measuring the approximate number of iid replications from the Markov chains, does not appear to be problematic. Thus, HMC sampling reached the convergence.

As we could expect, there is a positive effect from the home-effect (posterior mean about 0.3), and this implies that if two teams are equally good (meaning that their attack and defence abilities almost coincide), assuming that the constant intercept \(\mu \approx 0\), we get that the average number of goals for the home-team will be \(\lambda_{1} = \exp \{0.3 \} \approx 1.35\), against \(\lambda_{2} = \exp \{0 \} = 1\).

In the model above, we are assuming that the covariance \(\lambda_{3n}\) is constant and not depending on the match and/or on teams characteristics/further covariates:

\[\begin{align} \lambda_{3n} =&\ \exp\{\rho\}\\ \rho \sim & \ \mathrm{N}^+(0,1),\\ \end{align}\]

where \(\rho\) is assigned an half-Gaussian distribution with standard deviation equal to 1. According to the fit above, this means that in the model above we get an estimate of \(\lambda_{3n}= \exp\{-4.25\} \approx 0.014\), suggesting a low, despite non-null, amount of goals-correlation existing in the 2000/2001 Italian Serie A. Of course, in the next package’s version, the user will be allowed to specify a more general linear predictor for \(\log(\lambda_{3n})\), as outlined in the BP presentation above, along with some prior distributions for the parameters involved in the covariance formulation.

We can also depict the marginal posterior distributions for \(\rho\) (and eventually for the other fixed-effects parameters) using the bayesplot package for Bayesian visualizations:

## Marginal posterior with bayesplot

posterior1 <- fit1_stan$fit$draws(format = "matrix")
mcmc_areas(posterior1, pars = c(
  "home", "rho",
  "sigma_att", "sigma_def"
)) +
  theme_bw()
plot of chunk static_fit_corr

plot of chunk static_fit_corr

We can also access the original BP Stan code by typing:

### Model's code extraction

fit1_stan$stan_code
#>   [1] "functions{"                                                                                                              
#>   [2] ""                                                                                                                        
#>   [3] "      real bipois_lpmf(array[] int r , real mu1,real mu2,real mu3) {"                                                    
#>   [4] "        real ss;"                                                                                                        
#>   [5] "        real log_s;"                                                                                                     
#>   [6] "        real mus;"                                                                                                       
#>   [7] "        int  miny;"                                                                                                      
#>   [8] ""                                                                                                                        
#>   [9] "        miny = min(r[1], r[2]);"                                                                                         
#>  [10] ""                                                                                                                        
#>  [11] "        ss = poisson_lpmf(r[1] | mu1) + poisson_lpmf(r[2] | mu2) -"                                                      
#>  [12] "          exp(mu3);"                                                                                                     
#>  [13] "        if(miny > 0) {"                                                                                                  
#>  [14] "          mus = -mu1-mu2+mu3;"                                                                                           
#>  [15] "          log_s = ss;"                                                                                                   
#>  [16] ""                                                                                                                        
#>  [17] "          for(k in 1:miny) {"                                                                                            
#>  [18] "            log_s = log_s + log(r[1] - k + 1) + mus"                                                                     
#>  [19] "            + log(r[2] - k + 1)"                                                                                         
#>  [20] "            - log(k);"                                                                                                   
#>  [21] "            ss = log_sum_exp(ss, log_s);"                                                                                
#>  [22] "          }"                                                                                                             
#>  [23] "        }"                                                                                                               
#>  [24] "        return(ss);"                                                                                                     
#>  [25] "      }"                                                                                                                 
#>  [26] ""                                                                                                                        
#>  [27] "    }"                                                                                                                   
#>  [28] "    data{"                                                                                                               
#>  [29] "      int N;   // number of games"                                                                                       
#>  [30] "      int<lower=0> N_prev;"                                                                                              
#>  [31] "      array[N,2] int y;"                                                                                                 
#>  [32] "      int nteams;"                                                                                                       
#>  [33] "      array[N] int instants_rank;"                                                                                       
#>  [34] "      int ntimes_rank;                 // dynamic periods for ranking"                                                   
#>  [35] "      array[N] int team1;"                                                                                               
#>  [36] "      array[N] int team2;"                                                                                               
#>  [37] "      array[N_prev]int team1_prev;"                                                                                      
#>  [38] "      array[N_prev] int team2_prev;"                                                                                     
#>  [39] "      matrix[ntimes_rank,nteams] ranking;"                                                                               
#>  [40] "      int<lower=0, upper=1> ind_home;"                                                                                   
#>  [41] "      real mean_home;              // Mean for home effect"                                                              
#>  [42] "      real<lower=1e-8> sd_home;      // Standard deviation for home effect"                                              
#>  [43] ""                                                                                                                        
#>  [44] "      // priors part"                                                                                                    
#>  [45] "      int<lower=1,upper=4> prior_dist_num;    // 1 gaussian, 2 t, 3 cauchy, 4 laplace"                                   
#>  [46] "      int<lower=1,upper=4> prior_dist_sd_num; // 1 gaussian, 2 t, 3 cauchy, 4 laplace"                                   
#>  [47] ""                                                                                                                        
#>  [48] "      real<lower=0> hyper_df;"                                                                                           
#>  [49] "      real hyper_location;"                                                                                              
#>  [50] ""                                                                                                                        
#>  [51] "      real<lower=0> hyper_sd_df;"                                                                                        
#>  [52] "      real hyper_sd_location;"                                                                                           
#>  [53] "      real<lower=1e-8> hyper_sd_scale;"                                                                                  
#>  [54] "    }"                                                                                                                   
#>  [55] "    parameters{"                                                                                                         
#>  [56] "      vector[nteams] att_raw;"                                                                                           
#>  [57] "      vector[nteams] def_raw;"                                                                                           
#>  [58] "      real<lower=1e-8> sigma_att;"                                                                                       
#>  [59] "      real<lower=1e-8> sigma_def;"                                                                                       
#>  [60] "      real home;"                                                                                                        
#>  [61] "      real rho;"                                                                                                         
#>  [62] "      real gamma;"                                                                                                       
#>  [63] "    }"                                                                                                                   
#>  [64] "    transformed parameters{"                                                                                             
#>  [65] "      real adj_h_eff;                   // Adjusted home effect"                                                         
#>  [66] "      vector[nteams] att;"                                                                                               
#>  [67] "      vector[nteams] def;"                                                                                               
#>  [68] "      array[N] vector[3] theta;"                                                                                         
#>  [69] ""                                                                                                                        
#>  [70] "      for (t in 1:nteams){"                                                                                              
#>  [71] "        att[t] = att_raw[t]-mean(att_raw);"                                                                              
#>  [72] "        def[t] = def_raw[t]-mean(def_raw);"                                                                              
#>  [73] "      }"                                                                                                                 
#>  [74] ""                                                                                                                        
#>  [75] "      adj_h_eff = home * ind_home;"                                                                                      
#>  [76] ""                                                                                                                        
#>  [77] "      for (n in 1:N){"                                                                                                   
#>  [78] "        theta[n,1] = exp(adj_h_eff+att[team1[n]]+def[team2[n]]+"                                                         
#>  [79] "                         (gamma/2)*(ranking[instants_rank[n], team1[n]]-ranking[instants_rank[n], team2[n]]));"          
#>  [80] "        theta[n,2] = exp(att[team2[n]]+def[team1[n]]-"                                                                   
#>  [81] "                         (gamma/2)*(ranking[instants_rank[n], team1[n]]-ranking[instants_rank[n], team2[n]]));"          
#>  [82] "        theta[n,3] = exp(rho);"                                                                                          
#>  [83] "      }"                                                                                                                 
#>  [84] "    }"                                                                                                                   
#>  [85] "    model{"                                                                                                              
#>  [86] "      // log-priors for team-specific abilities"                                                                         
#>  [87] "      for (t in 1:(nteams)){"                                                                                            
#>  [88] "        if (prior_dist_num == 1){"                                                                                       
#>  [89] "          target+= normal_lpdf(att_raw[t]|hyper_location, sigma_att);"                                                   
#>  [90] "          target+= normal_lpdf(def_raw[t]|hyper_location, sigma_def);"                                                   
#>  [91] "        }"                                                                                                               
#>  [92] "        else if (prior_dist_num == 2){"                                                                                  
#>  [93] "          target+= student_t_lpdf(att_raw[t]|hyper_df, hyper_location, sigma_att);"                                      
#>  [94] "          target+= student_t_lpdf(def_raw[t]|hyper_df, hyper_location, sigma_def);"                                      
#>  [95] "        }"                                                                                                               
#>  [96] "        else if (prior_dist_num == 3){"                                                                                  
#>  [97] "          target+= cauchy_lpdf(att_raw[t]|hyper_location, sigma_att);"                                                   
#>  [98] "          target+= cauchy_lpdf(def_raw[t]|hyper_location, sigma_def);"                                                   
#>  [99] "        }"                                                                                                               
#> [100] "        else if (prior_dist_num == 4){"                                                                                  
#> [101] "          target+= double_exponential_lpdf(att_raw[t]|hyper_location, sigma_att);"                                       
#> [102] "          target+= double_exponential_lpdf(def_raw[t]|hyper_location, sigma_def);"                                       
#> [103] "        }"                                                                                                               
#> [104] "      }"                                                                                                                 
#> [105] ""                                                                                                                        
#> [106] ""                                                                                                                        
#> [107] "      // log-hyperpriors for sd parameters"                                                                              
#> [108] "      if (prior_dist_sd_num == 1 ){"                                                                                     
#> [109] "        target+=normal_lpdf(sigma_att|hyper_sd_location, hyper_sd_scale);"                                               
#> [110] "        target+=normal_lpdf(sigma_def|hyper_sd_location, hyper_sd_scale);"                                               
#> [111] "      }"                                                                                                                 
#> [112] "      else if (prior_dist_sd_num == 2){"                                                                                 
#> [113] "        target+=student_t_lpdf(sigma_att|hyper_sd_df, hyper_sd_location, hyper_sd_scale);"                               
#> [114] "        target+=student_t_lpdf(sigma_def|hyper_sd_df, hyper_sd_location, hyper_sd_scale);"                               
#> [115] "      }"                                                                                                                 
#> [116] "      else if (prior_dist_sd_num == 3){"                                                                                 
#> [117] "        target+=cauchy_lpdf(sigma_att|hyper_sd_location, hyper_sd_scale);"                                               
#> [118] "        target+=cauchy_lpdf(sigma_def|hyper_sd_location, hyper_sd_scale);"                                               
#> [119] "      }"                                                                                                                 
#> [120] "      else if (prior_dist_sd_num == 4){"                                                                                 
#> [121] "        target+=double_exponential_lpdf(sigma_att|hyper_sd_location, hyper_sd_scale);"                                   
#> [122] "        target+=double_exponential_lpdf(sigma_def|hyper_sd_location, hyper_sd_scale);"                                   
#> [123] "      }"                                                                                                                 
#> [124] ""                                                                                                                        
#> [125] "      // log-priors fixed effects"                                                                                       
#> [126] "      target+=normal_lpdf(home|mean_home,sd_home);"                                                                      
#> [127] "      target+=normal_lpdf(rho|0,1);"                                                                                     
#> [128] "      target+=normal_lpdf(gamma|0,1);"                                                                                   
#> [129] ""                                                                                                                        
#> [130] "      // likelihood"                                                                                                     
#> [131] "      for (n in 1:N){"                                                                                                   
#> [132] "        //target+=bipois_lpmf(y[n,]| theta[n,1],"                                                                        
#> [133] "        //                    theta[n,2], theta[n,3]);"                                                                  
#> [134] "        target+=poisson_lpmf(y[n,1]| theta[n,1]+theta[n,3]);"                                                            
#> [135] "        target+=poisson_lpmf(y[n,2]| theta[n,2]+theta[n,3]);"                                                            
#> [136] "      }"                                                                                                                 
#> [137] "    }"                                                                                                                   
#> [138] "    generated quantities{"                                                                                               
#> [139] "      array[N,2]int y_rep;"                                                                                              
#> [140] "      array[N_prev,2] int y_prev;"                                                                                       
#> [141] "      array[N_prev] vector[3] theta_prev;"                                                                               
#> [142] "      vector[N] log_lik;"                                                                                                
#> [143] "      array[N] int diff_y_rep;"                                                                                          
#> [144] ""                                                                                                                        
#> [145] "      //in-sample replications"                                                                                          
#> [146] "      for (n in 1:N){"                                                                                                   
#> [147] "        y_rep[n,1] = poisson_rng(theta[n,1]+theta[n,3]);"                                                                
#> [148] "        y_rep[n,2] = poisson_rng(theta[n,2]+theta[n,3]);"                                                                
#> [149] "        diff_y_rep[n] = y_rep[n,1] - y_rep[n,2];"                                                                        
#> [150] "        log_lik[n] = poisson_lpmf(y[n,1]| theta[n,1]+theta[n,3])+"                                                       
#> [151] "                     poisson_lpmf(y[n,2]| theta[n,2]+theta[n,3]);"                                                       
#> [152] "        //bipois_lpmf(y[n,]| theta[n,1],"                                                                                
#> [153] "        //                        theta[n,2], theta[n,3]);"                                                              
#> [154] "      }"                                                                                                                 
#> [155] ""                                                                                                                        
#> [156] "      //out-of-sample predictions"                                                                                       
#> [157] "      if (N_prev > 0) {"                                                                                                 
#> [158] "        for (n in 1:N_prev){"                                                                                            
#> [159] "          theta_prev[n,1] = exp(adj_h_eff+att[team1_prev[n]]+"                                                           
#> [160] "                                  def[team2_prev[n]]+"                                                                   
#> [161] "                           (gamma/2)*(ranking[instants_rank[N],team1_prev[n]]-ranking[instants_rank[N],team2_prev[n]]));"
#> [162] "          theta_prev[n,2] = exp(att[team2_prev[n]]+"                                                                     
#> [163] "                                  def[team1_prev[n]]-"                                                                   
#> [164] "                           (gamma/2)*(ranking[instants_rank[N],team1_prev[n]]-ranking[instants_rank[N],team2_prev[n]]));"
#> [165] "          theta_prev[n,3] = exp(rho);"                                                                                   
#> [166] "          y_prev[n,1] = poisson_rng(theta_prev[n,1]+theta_prev[n,3]);"                                                   
#> [167] "          y_prev[n,2] = poisson_rng(theta_prev[n,2]+theta_prev[n,3]);"                                                   
#> [168] "        }"                                                                                                               
#> [169] "      }"                                                                                                                 
#> [170] "    }"

In addition the stan_foot function provides flexibility in obtaining posterior distributions by allowing users to choose the estimation method via the method argument. The available options include:

Below is an example that demonstrates how to use Pathfinder algorithm as the estimation method:

# Pathfinder algorithm example
fit1_stan_path <- stan_foot(
  data = italy_2000,
  model = "biv_pois",
  method = "pathfinder"
) # biv poisson

We fit now the same model under the MLE approach with Wald-type confidence intervals. We can then print the MLE estimates, e.g for the parameters \(\rho\) and \(\text{home}\):

### Fit MLE models
## no dynamics, no predictions
## Wald intervals

fit1_mle <- mle_foot(
  data = italy_2000,
  model = "biv_pois",
  interval = "Wald"
) # mle biv poisson
fit1_mle$home_effect
#>      2.5% mle 97.5%
#> [1,]  0.2 0.3  0.39

We got a very similar estimate to the Bayesian model for the home-effect.

Changing default priors

One of the common practices in Bayesian statistics is to change the priors and perform some sensitivity tests. The default priors for the team-specific abilities and their related team-level standard deviations are:

\[\begin{align} \text{att}_t &\sim \mathrm{N}(\mu_{\text{att}}, \sigma_{\text{att}}),\\ \text{def}_t &\sim \mathrm{N}(\mu_{\text{def}}, \sigma_{\text{def}}),\\ \sigma_{\text{att}}, \sigma_{\text{def}} &\sim \mathsf{Cauchy}^+(0,5), \end{align}\]

where \(\mathsf{Cauchy}^+\) denotes the half-Cauchy distribution with support \([0, +\infty)\). However, the user is free to elicit some different priors, possibly choosing one among the following distributions: Gaussian (normal), student-\(t\) (student_t), Cauchy (cauchy) and Laplace (laplace). The ability optional argument allows to specify the priors for the team-specific parameters \(\text{att}\) and \(\text{def}\), whereas the optional argument ability_sd allows to assign a prior to the group-level standard deviations \(\sigma_{\text{att}}, \sigma_{\text{def}}\). For instance, for each team \(t, t=1,\ldots,T\), we could consider:

\[\begin{align} \text{att}_t &\sim t(4, \mu_{\text{att}}, \sigma_{\text{att}}),\\ \text{def}_t &\sim t(4, \mu_{\text{def}}, \sigma_{\text{def}}),\\ \sigma_{\text{att}}, \sigma_{\text{def}} &\sim \mathsf{Laplace}^+(0,1), \end{align}\]

where \(t(\text{df}, \mu, \sigma)\) denotes a student-\(t\) distribution with df degrees of freedom, location \(\mu\) and scale \(\sigma\), whereas \(\mathsf{Laplace}^+\) denotes a half-Laplace distribution.

### Fit Stan models
## changing priors
## student-t for team-specific abilities, laplace for sds

fit1_stan_t <- stan_foot(
  data = italy_2000,
  model = "biv_pois",
  chains = 4,
  prior_par = list(
    ability = student_t(4, 0, NULL),
    ability_sd = laplace(0, 1),
    home = normal(0, 10)
  ),
  # parallel_chains = 4,
  iter_sampling = n_iter
) # biv poisson

Then, we can compare the marginal posteriors from the two models, the one with Gaussian team-specific abilities and the default \(\mathsf{Cauchy}(0,5)\) for the team-level sds, and the other one specified above, with student-\(t\) distributed team-specific abilities and the \(\mathsf{Laplace}^+(0,1)\) for the team-level sds. We depict here the comparison for the attack team-level sds only:

## comparing posteriors

posterior1_t <- fit1_stan_t$fit$draws(format = "matrix")
model_names <- c("Default", "Stud+Laplace")
color_scheme_set(scheme = "gray")
gl_posterior <- cbind(
  posterior1[, "sigma_att"],
  posterior1_t[, "sigma_att"]
)
colnames(gl_posterior) <- c("sigma_att", "sigma_att_t")
mcmc_areas(gl_posterior, pars = c("sigma_att", "sigma_att_t")) +
  xaxis_text(on = TRUE, size = ggplot2::rel(2.9)) +
  yaxis_text(on = TRUE, size = ggplot2::rel(2.9)) +
  scale_y_discrete(labels = ((parse(text = model_names)))) +
  ggtitle("Att/def sds") +
  theme(plot.title = element_text(hjust = 0.5, size = rel(2.6))) +
  theme_bw()
plot of chunk comparing_priors

plot of chunk comparing_priors

The student+laplace prior induces a lower amount of group-variability in the \(\sigma_{\text{att}}\) marginal posterior distribution (then, a larger shrinkage towards the grand mean \(\mu_{\text{att}}\)).

When specifying the prior for the team-specific parameters through the argument ability, you are not allowed to fix the group-level standard deviations \(\sigma_{\text{att}}, \sigma_{\text{def}}\) to some numerical values. Rather, they need to be assigned a reasonable prior distribution. For such reason, the most appropriate specification for the ability argument is ability = 'dist'(0, NULL), where the scale argument is set to NULL (otherwise, a warning message is occurring).

Dynamic fit

A structural limitation in the previous models is the assumption of static team-specific parameters, namely teams are assumed to have a constant performance across time, as determined by the attack and defence abilities (att, def). However, teams’ performance tend to be dynamic and change across different years, if not different weeks. Many factors contribute to this football aspect:

  1. teams act during the summer/winter players’ transfermarket, by dramatically changing their rosters;

  2. some teams’ players could be injured in some periods, by affecting the global quality of the team in some matches;

  3. coaches could be dismissed from their teams due to some non satisfactory results;

  4. some teams could improve/worsen their attitudes due to the so-called turnover;

and many others. Again, we could assume dynamics in the attach/defence abilities as in Owen (2011) , Egidi, Pauli, and Torelli (2018) and Macrı̀ Demartino, Egidi, and Torelli (2024) in terms of weeks or seasons. In such framework, for a given number of times \(1, \ldots, \mathcal{T}\), the models above would be unchanged, but the priors for the abilities parameters at each time \(\tau, \tau=2,\ldots, \mathcal{T},\) would be auto-regressive of order 1:

\[\begin{align} \text{att}_{t, \tau} & \sim \mathrm{N}({\text{att}}_{t, \tau-1}, \sigma_{\text{att}})\\ \text{def}_{t, \tau} &\sim \mathrm{N}({\text{def}}_{t, \tau-1}, \sigma_{\text{def}}), \end{align}\]

whereas for \(\tau=1\) we have:

\[\begin{align} \text{att}_{t, 1} & \sim \mathrm{N}(\mu_{\text{att}}, \sigma_{\text{att}})\\ \text{def}_{t, 1} &\sim \mathrm{N}(\mu_{\text{def}}, \sigma_{\text{def}}), \end{align}\]

with hyperparameters \(\mu_{\text{att}}, \mu_{\text{def}}, \sigma_{\text{att}}, \sigma_{\text{def}}\) and with the standard deviations capturing the time’s/evolution’s variation of both the teams’ skills (here assumed to be constant across time and teams). Of course, the identifiability constraint must be imposed for each time \(\tau\).

We can use the dynamic_type argument in the stan_foot function, with possible options 'seasonal' or 'weekly' in order to consider more seasons (no examples are given in this course) or more week-times within a single season, respectively. Let’s fit a weekly-dynamic parameters model on the Serie A 2000/2001 season:

### Fit Stan models
## seasonal dynamics, no predictions
## 2 Markov chains, 'n_iter' iterations each

fit2_stan <- stan_foot(
  data = italy_2000,
  model = "biv_pois",
  dynamic_type = "weekly",
  # parallel_chains = 2,
  chains = 2,
  iter_sampling = n_iter
) # biv poisson
print(fit2_stan, pars = c(
  "home", "rho", "sigma_att",
  "sigma_def"
))
#> Summary of Stan football model
#> ------------------------------
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 4 × 10
#>   variable    mean median    sd   mad     q5    q95  rhat ess_bulk ess_tail
#>   <chr>      <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>    <dbl>    <dbl>
#> 1 home       0.28   0.286 0.069 0.072  0.168  0.387  1.00   521.      285. 
#> 2 rho       -1.76  -1.71  0.367 0.301 -2.39  -1.30   1.00   460.      251. 
#> 3 sigma_att  0.048  0.047 0.009 0.007  0.035  0.066  1.22     7.20     11.4
#> 4 sigma_def  0.058  0.056 0.012 0.012  0.042  0.08   1.30     5.56     24.1

From the printed summary, we may note that the degree of goals’ correlation seems to be again very small here. Moreover, the Gelman-Rubin statistic for \(\sigma_{\text{att}}\) is relatively high, whereas the effective sample sizes for \(\sigma_{\text{att}}\) and \(\sigma_{\text{def}}\) are quite low. This is suggesting possible inefficiencies during the HMC sampling and that a model-reparametrization could be suited and effective at this stage. Another option is to play a bit with the prior specification for \(\sigma_{\text{att}}\) and \(\sigma_{\text{def}}\), for instance by specifying a prior inducing less shrinkage in the team-specific abilities.

To deal with these issues and broaden the set of candidate models, let’s fit also a dynamic double-Poisson model with the double_pois option for the argument model:

### Fit Stan models
## weekly dynamics, no predictions
## 2 chains, 'n_iter' iterations each

fit3_stan <- stan_foot(
  data = italy_2000,
  model = "double_pois",
  dynamic_type = "weekly",
  # parallel_chains = 2,
  chains = 2,
  iter_sampling = n_iter
) # double poisson
print(fit3_stan, pars = c(
  "home", "sigma_att",
  "sigma_def"
))
#> Summary of Stan football model
#> ------------------------------
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 3 × 10
#>   variable   mean median    sd   mad    q5   q95  rhat ess_bulk ess_tail
#>   <chr>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 home      0.407  0.408 0.052 0.056 0.32  0.494  1.01   412.      231. 
#> 2 sigma_att 0.049  0.046 0.014 0.015 0.032 0.075  1.85     3.04     36.9
#> 3 sigma_def 0.053  0.05  0.011 0.011 0.039 0.074  1.31     7.03     19.3

The fitting problems mentioned above remain also for the double Poisson model…Thus, it’s time to play a little bit with the prior distributions. Also in the dynamic approach we can change the default priors for the team-specific abilities and their standard deviations, respectively, through the optional arguments ability and ability_sd. The specification follows almost analogously the static case: with the first argument we may specify the prior’s family for the team-specific abilities and the specific priors for \(\text{att}_{t,1}, \text{def}_{t,1}\) along with the hyper-prior location \(\mu_{\text{att}}, \mu_{\text{def}}\), whereas \(\sigma_{\text{att}}\) and \(\sigma_{\text{def}}\) need to be assigned some proper prior distribution. Assume to fit the same double Poisson model, but here we suppose student-\(t\) distributed team-specific abilities with 4 degrees of freedom to eventually capture more extreme team-specific abilities (more variability, i.e. less shrinkage), along with a \(\text{Cauchy}^+(0,25)\) for their standard deviations (to better capture a possible larger evolution variability):

\[\begin{align} \text{att}_{t, \tau} &\ \sim t(4, {\text{att}}_{t, \tau-1}, \sigma_{\text{att}})\\ \text{def}_{t, \tau} &\ \sim t(4, {\text{def}}_{t, \tau-1}, \sigma_{\text{def}})\\ \sigma_{\text{att}}, \sigma_{\text{def}} &\ \sim \mathsf{Cauchy}^+(0,25), \end{align}\]

### Fit Stan models
## weekly dynamics, no predictions
## 2 chains, 'n_iter' iterations each

fit3_stan_t <- stan_foot(
  data = italy_2000,
  model = "double_pois",
  prior_par = list(
    ability = student_t(4, 0, NULL),
    ability_sd = cauchy(0, 25),
    home = normal(0, 5)
  ),
  dynamic_type = "weekly",
  # parallel_chains = 2,
  chains = 2,
  iter_sampling = n_iter
) # double poisson
print(fit3_stan_t, pars = c(
  "home", "sigma_att",
  "sigma_def"
))
#> Summary of Stan football model
#> ------------------------------
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 3 × 10
#>   variable   mean median    sd   mad    q5   q95  rhat ess_bulk ess_tail
#>   <chr>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 home      0.41   0.411 0.043 0.041 0.333 0.48   1.03   131.      208. 
#> 2 sigma_att 0.033  0.032 0.008 0.009 0.022 0.046  1.47     3.98     53.3
#> 3 sigma_def 0.037  0.04  0.015 0.019 0.014 0.06   2.10     2.77     22.4

As we may conclude, the situation has been only slightly improved.

Using btd_foot and stan_foot sequentially

One of the main features of the footBayes package is the ability to first compute the log-strengths using the Bayesian Bradley-Terry-Davidson model through the btd_foot function and then fit a Bayesian goal-based model using the stan_foot by passing the previously estimated historical team strengths as an additional covariate via the ranking argument. Specifically, this argument accepts an element of class btdFootor alternatively a data frame with the following columns:

The ability to use either btdFoot objects or custom ranking data frames provides flexibility in how the user can model team strengths, allowing for the integration of various ranking systems, such as the FIFA ranking.

# Dynamic Bradley-Terry-Davidson model

data("italy")

italy_2020_2021_rank <- italy %>%
  dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
  dplyr::filter(Season == "2020" | Season == "2021") %>%
  dplyr::mutate(match_outcome = dplyr::case_when(
    hgoal > vgoal ~ 1, # Home team wins
    hgoal == vgoal ~ 2, # Draw
    hgoal < vgoal ~ 3 # Away team wins
  )) %>%
  dplyr::filter(dplyr::row_number() <= 570) %>%
  dplyr::mutate(periods = dplyr::case_when(
    dplyr::row_number() <= 190 ~ 1,
    dplyr::row_number() <= 380 ~ 2,
    dplyr::row_number() <= 570 ~ 3
  )) %>%
  dplyr::select(periods,
    home_team = home,
    away_team = visitor, match_outcome
  )



fit_btd_dyn <- btd_foot(
  data = italy_2020_2021_rank,
  dynamic_rank = TRUE,
  rank_measure = "median",
  iter_sampling = 1000,
  # parallel_chains = 2,
  chains = 2,
  adapt_delta = 0.9,
  max_treedepth = 12
)

# Dynamic Bivariate Poisson Model

italy_2020_2021_fit <- italy %>%
  dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
  dplyr::filter(Season == "2020" | Season == "2021") %>%
  dplyr::filter(dplyr::row_number() <= 570) %>%
  dplyr::mutate(periods = dplyr::case_when(
    dplyr::row_number() <= 190 ~ 1,
    dplyr::row_number() <= 380 ~ 2,
    dplyr::row_number() <= 570 ~ 3
  )) %>% # Assign periods based on match number
  dplyr::select(periods,
    home_team = home,
    away_team = visitor, home_goals = hgoal, away_goals = vgoal
  )



fit_stan_rank <- stan_foot(
  data = italy_2020_2021_fit,
  model = "biv_pois",
  ranking = fit_btd_dyn,
  predict = 180,
  prior_par = list(
    ability = student_t(4, 0, NULL),
    ability_sd = cauchy(0, 25),
    home = normal(0, 5)
  ),
  dynamic_type = "season",
  chains = 2,
  # parallel_chains = 2,
  iter_sampling = 1000
)
print(fit_stan_rank,
  pars = c("home", "rho", "sigma_att", "sigma_def")
)
#> Summary of Stan football model
#> ------------------------------
#> 
#> Posterior summaries for model parameters:
#> # A tibble: 4 × 10
#>   variable    mean median    sd   mad     q5    q95  rhat ess_bulk ess_tail
#>   <chr>      <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>    <dbl>    <dbl>
#> 1 home       0.159  0.159 0.068 0.064  0.047  0.272  1      2342.    1669. 
#> 2 rho       -1.10  -1.09  0.162 0.163 -1.38  -0.849  1      2388.    1576. 
#> 3 sigma_att  0.025  0.02  0.02  0.015  0.004  0.064  1.06     16.1     42.5
#> 4 sigma_def  0.03   0.026 0.02  0.022  0.005  0.066  1.18     12.9     44.1

Model estimates and visualization tools

Plotting and interpreting team-specific abilities

Once the model has been fitted, there is a large amount of interesting summaries to explore. The function foot_abilities allows to depict posterior/confidence intervals for global attack and defense abilities on the considered data (attack abilities are plotted in red, whereas defense abilities in blue colors). The higher the attack and the lower the defence for a given team, and the better is the overall team’s strength.

We can produce the team-specific abilities for the two static fits above, fit1_mle (MLE) and fit1_stan (Bayes), with red bars for the attack and blue bars for the defence, respectively:

## Plotting abilities: credible and confidence 95% intervals

foot_abilities(object = fit1_stan, data = italy_2000)
foot_abilities(object = fit1_mle, data = italy_2000)
plot of chunk abilitiesplot of chunk abilities

plot of chunk abilities

AS Roma, the team winning the Serie A 2000/2001, is associated with the highest attack ability and the lowest defence ability according to both the models. In general, the models seem to well capture the static abilities: AS Roma, Lazio Roma and Juventus (1st, 3rd and 2nd at the end of that season, respectively) are rated as the best teams in terms of their abilities, whereas AS Bari, SSC Napoli and Vicenza Calcio (all relegated at the end of the season) have the worst abilities.

We can also depict the team-specific dynamic plots for the dynamic models:

## Plotting abilities: credible and confidence 95% intervals

foot_abilities(fit2_stan, italy_2000)
plot of chunk abilities_dyn

plot of chunk abilities_dyn

As we can see, dynamic abilities naturally evolve over the time: better teams (AS Roma, Lazio Roma, Juventus, Parma) are associated with increasing attack abilities and decreasing defence abilities, whereas the worst ones (AS Bari, SSC NApoli, and Hellas Verona) exhibit decreasing attacking skills and increasing defensive skills. The reason for these increasing/decreasing behaviours is straightforward: at the beginning, all the attack/defence parameters have been initialized to have location equal to 0. The user is free to change the location, and in the final package’s version he will also have the possibility to elicit different team-specific hyper-prior locations.

Model checking

Checking the model fit is a relevant and vital statistical task. To this purpose, we can evaluate hypothetical replications \(\mathcal{D}^{\text{rep}}\) under the posterior predictive distribution

\[p(\mathcal{D}^{\text{rep}}| \mathcal{D}) = \int p(\mathcal{D}^{\text{rep}}| \boldsymbol{\theta}) \pi(\boldsymbol{\theta}| \mathcal{D}) d\boldsymbol{\theta},\] and check whether these replicated values are somehow close to the observed data \(\mathcal{D}\). These methods comparing hypothetical replications with the observed data are named posterior predictive checks and have great theoretical and applied appeal in Bayesian inference. See Gelman et al. (2014) for an overview.

The function pp_foot allows to obtain:

## PP checks: aggregated goal's differences and ordered goal differences

pp_foot(
  object = fit1_stan, data = italy_2000,
  type = "aggregated"
)
#> $pp_plot
plot of chunk pp_foot

plot of chunk pp_foot

#> 
#> $pp_table
#>   goal diff. Bayesian p-value
#> 1         -3            0.326
#> 2         -2            0.870
#> 3         -1            0.988
#> 4          0            0.041
#> 5          1            0.265
#> 6          2            0.109
#> 7          3            0.924

pp_foot(
  object = fit1_stan, data = italy_2000,
  type = "matches"
)
#> $pp_plot
plot of chunk pp_foot

plot of chunk pp_foot

#> 
#> $pp_table
#>   1-alpha emp. coverage
#> 1    0.95         0.944

The aggregated goal difference frequencies seem to be decently captured by the model’s replications: in the first plot, blue horizontal lines denote the observed goal differences frequencies registered in the dataset, whereas yellow jittered points denote the correspondent replications. Goal-difference of 0, corresponding to the draws occurrences, is only slightly underestimated by the model. However, in general there are no particular clues of model’s misfit.

In the second plot, the ordered observed goal differences are plotted against their replications (50% and 95% credible intervals), and from this plot also we do not have particular signs of model’s misfits.

Other useful PP checks, such as the overlap between data density and replicated data densities to check eventual inconsistencies, can be obtained through the standard use of the bayesplot package, in this case providing an approximation to a continuous distribution using an input kernel choice (bw = 0.5 in the ppc_dens_overlay used below):

## PPC densities overlay with the bayesplot package

# extracting the replications
draws_raw <- fit1_stan$fit$draws()
draws <- posterior::as_draws_rvars(draws_raw)
sims <- list()
sims$y_rep <- posterior::draws_of(draws[["y_rep"]])

goal_diff <- italy_2000$home_goals - italy_2000$away_goals

# plotting data density vs replications densities

ppc_dens_overlay(goal_diff, sims$y_rep[, , 1] - sims$y_rep[, , 2], bw = 0.5) +
  theme_bw()
plot of chunk pp_checks

plot of chunk pp_checks

From this plot above we have the empirical confirmation that the goal difference is well captured by the static bivariate Poisson model.

Predictions and predictive accuracy

Posterior out-of-sample probabilities

The hottest feature in sports analytics is to obtain future predictions. By considering the posterior predictive distribution for future and observable data \(\tilde{\mathcal{D}}\), we acknowledge the whole model’s prediction uncertainty (which propagates from the posterior model’s uncertainty) and we can then generate observable values \(\tilde{D}\) conditioned on the posterior model’s parameters estimates:

\[p(\tilde{\mathcal{D}}| \mathcal{D}) = \int p(\tilde{\mathcal{D}}| \boldsymbol{\theta}) \pi(\boldsymbol{\theta}| \mathcal{D}) d\boldsymbol{\theta}.\]

We may then predict test set matches by using the argument predict of the stan_foot function, for instance considering the last four weeks of the 2000/2001 season as the test set, and then computing posterior-results probabilities using the function foot_prob for two teams belonging to the test set, such as Reggina Calcio and AC Milan:

### Fit Stan models
## weekly dynamics, predictions of last four weeks
## 2 chains 'n_iter' iterations each

fit4_stan <- stan_foot(
  data = italy_2000,
  model = "biv_pois",
  predict = 36,
  dynamic_type = "weekly",
  # parallel_chains = 2,
  chains = 2,
  iter_sampling = n_iter
) # biv poisson
foot_prob(
  object = fit4_stan, data = italy_2000,
  home_team = "Reggina Calcio",
  away_team = "AC Milan"
)
#> $prob_table
#>        home_team away_team prob_h prob_d prob_a         mlo
#> 1 Reggina Calcio  AC Milan  0.323  0.248   0.43 1-1 (0.101)
#> 
#> $prob_plot
plot of chunk foot_prob_weekly_predict

plot of chunk foot_prob_weekly_predict

Darker regions are associated with higher posterior probabilities, whereas the red square corresponds to the actual observed result, 2-1 for Reggina Calcio. According to the posterior-results probabilities, this final observed result had in principle about a 5% probability to happen! (Remember, football is about rare events…).

Home-win posterior probabilities

We can also use the out-of-sample posterior-results probabilities to compute some aggregated home/draw/loss probabilities (based then on the \(S\) draws from the MCMC method) for a given match:

\[\begin{align} p_{\text{home}}= &\ \textrm{Pr}(X>Y) = \frac{1}{S}\sum_{s=1}^S| \tilde{x}^{(s)}> \tilde{y}^{(s)}|\\ p_{\text{draw}} = &\ \textrm{Pr}(X=Y) = \frac{1}{S}\sum_{s=1}^S| \tilde{x}^{(s)}= \tilde{y}^{(s)}|\\ p_{\text{loss}} = &\ \textrm{Pr} (X<Y)=\frac{1}{S}\sum_{s=1}^S| \tilde{x}^{(s)}< \tilde{y}^{(s)}|, \end{align}\]

where \((\tilde{x}^{(s)}, \tilde{y}^{(s)})\) represents the \(s\)-th MCMC pair of the future home goals and away goals for a given match, respectively. According to this scenario, we can depict the home-win posterior probabilities of a given test set through the function foot_round_robin:

## Home win out-of-sample probabilities

foot_round_robin(object = fit4_stan, data = italy_2000)
#> $round_table
#>              Home           Away Home_prob Observed
#> 1   Hellas Verona     Bologna FC     0.338        -
#> 2           Inter     Bologna FC     0.438        -
#> 3  Udinese Calcio     SSC Napoli     0.495        -
#> 4  ACF Fiorentina     SSC Napoli     0.535        -
#> 5        US Lecce     Lazio Roma     0.235        -
#> 6           Inter     Lazio Roma     0.295        -
#> 7  Brescia Calcio        AS Bari     0.627        -
#> 8  Reggina Calcio        AS Bari     0.590        -
#> 9  Udinese Calcio Vicenza Calcio     0.392        -
#> 10 Brescia Calcio Vicenza Calcio     0.510        -
#> 11        AS Roma       Parma AC     0.392        -
#> 12       US Lecce       Parma AC     0.220        -
#> 13        AS Roma       AC Milan     0.510        -
#> 14 Reggina Calcio       AC Milan     0.318        -
#> 15  Hellas Verona     AC Perugia     0.365        -
#> 16       Juventus     AC Perugia     0.568        -
#> 17 ACF Fiorentina       Atalanta     0.425        -
#> 18       Juventus       Atalanta     0.527        -
#> 19     SSC Napoli        AS Roma     0.230        -
#> 20        AS Bari        AS Roma     0.255        -
#> 21     Lazio Roma Udinese Calcio     0.728        -
#> 22       Atalanta Udinese Calcio     0.555        -
#> 23     Bologna FC       US Lecce     0.557        -
#> 24 Vicenza Calcio       US Lecce     0.530        -
#> 25       AC Milan Brescia Calcio     0.510        -
#> 26     AC Perugia Brescia Calcio     0.390        -
#> 27     SSC Napoli  Hellas Verona     0.488        -
#> 28       Parma AC  Hellas Verona     0.640        -
#> 29     AC Perugia Reggina Calcio     0.525        -
#> 30       Atalanta Reggina Calcio     0.438        -
#> 31     Lazio Roma ACF Fiorentina     0.637        -
#> 32       AC Milan ACF Fiorentina     0.632        -
#> 33     Bologna FC       Juventus     0.320        -
#> 34 Vicenza Calcio       Juventus     0.275        -
#> 35        AS Bari          Inter     0.288        -
#> 36       Parma AC          Inter     0.620        -
#> 
#> $round_plot
plot of chunk foot_roundrobin

plot of chunk foot_roundrobin

Red cells denote more likely home-wins (close to 0.6), such as: Lazio Roma - Fiorentina (observed result: 3-0, home win), Lazio Roma - Udinese (observed result: 3-1, home win), Juventus - AC Perugia (observed result: 1-0, home win), Brescia Calcio - AS Bari (observed result: 3-1, home win). Conversely, lighter cells denote more likely away wins (close to 0.6), such as: AS Bari - AS Roma (observed result: 1-4, away win), AS Bari - Inter (observed result: 1-2, away win).

Rank-league reconstruction

Statisticians and football amateurs are much interested in the final rank-league predictions. However, predicting the final rank position (along with the teams’ points) is often assimilated to an oracle, rather than a coherent statistical procedure.

We can provide here:

## Rank league reconstruction

# aggregated plot

foot_rank(object = fit1_stan, data = italy_2000)
plot of chunk rank_pred1

plot of chunk rank_pred1


# team-specific plot

foot_rank(
  object = fit1_stan, data = italy_2000,
  visualize = "individual"
)
plot of chunk rank_pred1

plot of chunk rank_pred1

## Rank predictions for individual teams

# aggregated plot

foot_rank(object = fit4_stan, data = italy_2000)
plot of chunk rank_pred2

plot of chunk rank_pred2


# team-specific plot

foot_rank(
  object = fit4_stan, data = italy_2000,
  teams = c("AC Milan", "AS Roma"),
  visualize = "individual"
)
plot of chunk rank_pred2

plot of chunk rank_pred2


foot_rank(
  object = fit4_stan, data = italy_2000,
  visualize = "individual"
)
plot of chunk rank_pred2

plot of chunk rank_pred2

Model comparisons

Comparing Football Prediction Models using compare_foot

Evaluating the performance of these models is crucial to understand their predictive power and reliability. The compare_foot function provides a comprehensive way to compare different models or probability matrices using metrics like accuracy, Brier score, ranked probability score (RPS), Pseudo \(R^2\), and average coverage probability (ACP). It also offers the option to compute confusion matrices for a detailed performance analysis.

italy_2020_2021_fit <- italy %>%
  dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
  dplyr::filter(Season == "2020" | Season == "2021") %>%
  dplyr::mutate(periods = dplyr::case_when(
    dplyr::row_number() <= 190 ~ 1,
    dplyr::row_number() <= 380 ~ 2,
    dplyr::row_number() <= 570 ~ 3,
    TRUE ~ 4
  )) %>% # Assign periods based on match number
  dplyr::select(periods,
    home_team = home,
    away_team = visitor, home_goals = hgoal, away_goals = vgoal
  )


fit_comp_1 <- stan_foot(
  data = italy_2020_2021_fit,
  model = "biv_pois",
  home_effect = TRUE,
  predict = 190,
  dynamic_type = "season",
  # parallel_chains = 4,
  iter_sampling = n_iter
)

fit_comp_2 <- stan_foot(
  data = italy_2020_2021_fit,
  model = "double_pois",
  home_effect = TRUE,
  predict = 190,
  dynamic_type = "season",
  # parallel_chains = 4,
  iter_sampling = n_iter
)


italy_2020_2021_test <- italy %>%
  dplyr::select(Season, home, visitor, hgoal, vgoal) %>%
  dplyr::filter(Season == "2014" | Season == "2015") %>%
  dplyr::mutate(periods = dplyr::case_when(
    dplyr::row_number() <= 190 ~ 1,
    dplyr::row_number() <= 380 ~ 2,
    dplyr::row_number() <= 570 ~ 3,
    TRUE ~ 4
  )) %>%
  dplyr::filter(dplyr::row_number() > 570) %>%
  dplyr::select(periods,
    home_team = home,
    away_team = visitor,
    home_goals = hgoal,
    away_goals = vgoal
  )
compare_results_models <- compare_foot(
  source = list(
    biv_pois = fit_comp_1,
    double_pois = fit_comp_2
  ),
  test_data = italy_2020_2021_test,
  metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"),
  conf_matrix = TRUE
)


print(compare_results_models, digits = 3)
#> Predictive Performance Metrics
#>        Model   RPS accuracy brier pseudoR2   ACP
#>     biv_pois 0.255    0.353 0.716    0.311 0.350
#>  double_pois 0.256    0.368 0.715    0.309 0.361
#> 
#> Confusion Matrices
#> Model: biv_pois 
#> 
#>           Actual
#> Predicted  Home Win Draw Away Win
#>   Home Win       51   28       32
#>   Draw            0    0        0
#>   Away Win       38   25       16
#> 
#> Model: double_pois 
#> 
#>           Actual
#> Predicted  Home Win Draw Away Win
#>   Home Win       57   33       35
#>   Draw            0    0        0
#>   Away Win       32   20       13

LOOIC and WAIC

Comparing statistical models in terms of some predictive information criteria should conclude many analysis and can be carried out by using the Leave-one-out cross-validation criterion (LOOIC) and the Watanabe Akaike Information criterion (WAIC) performed by using the loo package. For more details about LOOIC and WAIC, read the paper Vehtari, Gelman, and Gabry (2017).

The general formulation for the predictive information criteria is the following:

\[ \text{crit}=-2 \widehat{\text{elpd}} = -2 (\widehat{\text{lpd}}- \text{parameters penalty})\]

The interpretation is the following: the lower is the value for an information criterion, and the better is the estimated model’s predictive accuracy. Moreover, if two competing models share the same value for the log predictive density, the model with less parameters is favored.

This is the Occam’s Razor occurring in statistics:

“Frustra fit per plura quod potest fieri per pauciora”

We can perform Bayesian model comparisons using the $loo() method, which computes an approximate LOO-CV using the loo package. To use this method, you must compute and store the pointwise log-likelihood in your Stan program. We will compare the static and weekly dynamic models for the 2000/2001 Italian Serie A:

### Model comparisons
## LOOIC, loo function

# compute loo

loo1 <- fit1_stan$fit$loo()
loo1_t <- fit1_stan_t$fit$loo()
loo2 <- fit2_stan$fit$loo()
loo3 <- fit3_stan$fit$loo()
loo3_t <- fit3_stan_t$fit$loo()


# compare three looic

loo_compare(loo1, loo1_t, loo2, loo3, loo3_t)
#>        elpd_diff se_diff
#> model1  0.0       0.0   
#> model2 -0.6       0.4   
#> model3 -1.2       2.5   
#> model4 -5.7       4.0   
#> model5 -6.4       4.1

According to the above model LOOIC comparisons, the weekly-dynamic double Poisson models attain the lowest LOOIC values and are then the favored models in terms of predictive accuracy. The static model’s fit1_stan final looic is suggesting that the assumption of static team-specific parameters is too restrictive and oversimplified to capture teams’ skills over time and make reliable predictions. Anyway, from model checking we have the suggestion that even the static model has a reliable goodness of fit and could be used for some simplified analysis not requiring complex dynamic patterns.

Extensions in next versions

Extensions and to-do list for the next package’s versions:

  1. Data Web-scraping: automatic routine to scrape data from internet;

  2. More numerical outputs: posterior probabilities, betting strategies, etc.;

  3. Diagnostics, pp checks designed for football;

  4. Teams’ statistics

  5. More covariates to be included in the model (possibly by users).

  6. More priors choices

References

Baio, Gianluca, and Marta Blangiardo. 2010. “Bayesian Hierarchical Model for the Prediction of Football Results.” Journal of Applied Statistics 37 (2): 253–64.
Betancourt, Michael. 2017. “A Conceptual Introduction to Hamiltonian Monte Carlo.” arXiv Preprint arXiv:1701.02434.
Bradley, Ralph Allan, and Milton E. Terry. 1952. “Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons.” Biometrika 39 (3/4): 324–45.
Davidson, Roger R. 1970. “On Extending the Bradley-Terry Model to Accommodate Ties in Paired Comparison Experiments.” Journal of the American Statistical Association 65 (329): 317–28.
Davidson, Roger R., and Robert J. Beaver. 1977. “On Extending the Bradley-Terry Model to Incorporate Within-Pair Order Effects.” Biometrics 33 (4): 693–702. http://www.jstor.org/stable/2529467.
Egidi, Leonardo, Francesco Pauli, and Nicola Torelli. 2018. “Combining Historical Data and Bookmakers’ Odds in Modelling Football Scores.” Statistical Modelling 18 (5-6): 436–59.
Egidi, Leonardo, and Nicola Torelli. 2020. “Comparing Goal-Based and Result-Based Approaches in Modelling Football Outcomes.” Social Indicators Research, 1–13.
Gabry, Jonah, Rok Češnovar, Andrew Johnson, and Steve Bronder. 2024. Cmdstanr: R Interface to ’CmdStan’. https://mc-stan.org/cmdstanr/.
Gelman, Andrew. 2014. “Stan Goes to the World Cup.” Statistical Modeling, Causal Inference, and Social Science Blog. https://statmodeling.stat.columbia.edu/2014/07/13/stan-analyzes-world-cup-data/.
Gelman, Andrew, John B Carlin, Hal S Stern, and Donald B Rubin. 2014. Bayesian Data Analysis. Vol. 2. Chapman & Hall/CRC Boca Raton, FL, USA.
Groll, Andreas, Gunther Schauberger, and Gerhard Tutz. 2015. “Prediction of Major International Soccer Tournaments Based on Team-Specific Regularized Poisson Regression: An Application to the FIFA World Cup 2014.” Journal of Quantitative Analysis in Sports 11 (2): 97–115.
Issa Mattos, David, and Érika Martins Silva Ramos. 2022. “Bayesian Paired Comparison with the Bpcs Package.” Behavior Research Methods 54 (4): 2025–45. https://doi.org/10.3758/s13428-021-01714-2.
Karlis, Dimitris, and Ioannis Ntzoufras. 2003. “Analysis of Sports Data by Using Bivariate Poisson Models.” Journal of the Royal Statistical Society: Series D (The Statistician) 52 (3): 381–93.
———. 2009. “Bayesian Modelling of Football Outcomes: Using the Skellam’s Distribution for the Goal Difference.” IMA Journal of Management Mathematics 20 (2): 133–45.
Koopman, Siem Jan, and Rutger Lit. 2015. “A Dynamic Bivariate Poisson Model for Analysing and Forecasting Match Results in the English Premier League.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 178 (1): 167–86.
Kucukelbir, Alp, Dustin Tran, Rajesh Ranganath, Andrew Gelman, and David M Blei. 2017. “Automatic Differentiation Variational Inference.” Journal of Machine Learning Research 18 (14): 1–45.
Macrı̀ Demartino, Roberto, Leonardo Egidi, and Nicola Torelli. 2024. “Alternative Ranking Measures to Predict International Football Results.” Computational Statistics, 1–19.
Owen, Alun. 2011. “Dynamic Bayesian Forecasting Models of Football Match Outcomes with Estimation of the Evolution Variance Parameter.” IMA Journal of Management Mathematics 22 (2): 99–113.
Reep, C., R. Pollard, and B. Benjamin. 1971. “Skill and Chance in Ball Games.” Journal of the Royal Statistical Society. Series A (General) 134 (4): 623–29. http://www.jstor.org/stable/2343657.
Robert, Christian, and George Casella. 2013. Monte Carlo Statistical Methods. Springer Science & Business Media.
Stan Development Team. 2020. RStan: The R Interface to Stan.” https://mc-stan.org/.
Vehtari, Aki, Andrew Gelman, and Jonah Gabry. 2017. “Practical Bayesian Model Evaluation Using Leave-One-Out Cross-Validation and WAIC.” Statistics and Computing 27 (5): 1413–32.
Whelan, John T. 2017. “Prior Distributions for the Bradley-Terry Model of Paired Comparisons.” arXiv Preprint arXiv:1712.05311.
Zhang, Lu, Bob Carpenter, Andrew Gelman, and Aki Vehtari. 2022. “Pathfinder: Parallel Quasi-Newton Variational Inference.” Journal of Machine Learning Research 23 (306): 1–49. http://jmlr.org/papers/v23/21-0889.html.