2 Defining the Model

In soccer, the number of goals scored by a single team can be thought of as coming from a Poisson distribution. Thus, the two scores from a given game, \(X\) and \(Y\), can each be modeled as Poisson distributed.

\[\begin{equation} \begin{split} X & \sim Poisson(\lambda_1)\\ Y & \sim Poisson(\lambda_2) \end{split} \tag{2.1} \end{equation}\]

This parameterization, however, assumes that the scores \(X\) and \(Y\) are independent of each other. In this project, the aim is to model the association between the two Poisson distributed variables. Two methods for modeling this association are examined: the first is a bivariate Poisson distribution, and the second is a mixed effects model with a random intercept for each game.
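As a point of reference, here is a minimal R sketch (with made-up scoring rates) showing that scores simulated under the independence assumption have essentially zero correlation; it is exactly this restriction that the two approaches below relax.

set.seed(9416)

# Made-up scoring rates; scores are simulated independently for each team
n_games <- 10000
x <- rpois(n_games, lambda = 1.5)    # goals for one team
y <- rpois(n_games, lambda = 1.2)    # goals for the other team

cor(x, y)    # approximately 0 under independence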

2.1 The bivariate Poisson model

In the bivariate Poisson model, the scores for teams \(A\) and \(B\), \(X_A\) and \(X_B\), are constructed from independent latent random variables \(G_i \sim Poisson(\lambda_i),\ i = 0,\ 1,\ 2\), such that

\[\begin{equation} \begin{split} X_A & = G_1 + G_0\\ X_B & = G_2 + G_0 \end{split} \tag{2.2} \end{equation}\]

The scores \(X_A\) and \(X_B\) are then jointly distributed

\[\begin{equation} (X_A,\ X_B) \sim BP(\lambda_1,\ \lambda_2,\ \lambda_0) \tag{2.3} \end{equation}\]
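A short R simulation makes this construction concrete; the rates below are arbitrary values chosen only for illustration.

set.seed(9416)

# Arbitrary illustrative rates
n_sims   <- 100000
lambda_1 <- 1.2
lambda_2 <- 0.9
lambda_0 <- 0.3

# Trivariate reduction from Equation (2.2)
g1 <- rpois(n_sims, lambda_1)
g2 <- rpois(n_sims, lambda_2)
g0 <- rpois(n_sims, lambda_0)

x_a <- g1 + g0
x_b <- g2 + g0

mean(x_a)       # approximately lambda_1 + lambda_0
mean(x_b)       # approximately lambda_2 + lambda_0
cov(x_a, x_b)   # approximately lambda_0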

In this parameterization, \(X_A\) and \(X_B\) are Poisson distributed with means equal to \(\lambda_1 + \lambda_0\) and \(\lambda_2 + \lambda_0\) respectively, with \(\lambda_0\) representing the covariance between \(X_A\) and \(X_B\) (AlMuhayfith, Alzaid, & Omair, 2016; Griffiths & Milne, 1978; Kawamura, 1973). We can model these parameters just as we would model, for example, the mean of a normal distribution. Thus, for a given game, \(i\),

\[\begin{equation} \begin{split} (X_{Ai},\ X_{Bi}) & \sim BP(\lambda_{1i},\ \lambda_{2i},\ \lambda_{0i}),\\ log(\lambda_{1i}) & = \omega_{1i}\beta_{1},\\ log(\lambda_{2i}) & = \omega_{2i}\beta_{2},\\ log(\lambda_{0i}) & = \omega_{0i}\beta_{0} \end{split} \tag{2.4} \end{equation}\]

where \(\omega\) represents a matrix of independent variables, and \(\beta\) denotes the regression coefficients (Karlis & Ntzoufras, 2003, 2005). When predicting soccer games, the independent variables are the teams that are playing, and the regression coefficients represent the offensive or defensive strength of the teams (Groll, Kneib, Mayr, & Schauberger, 2016). Specifically, we can model the two scores, \(X\), for teams \(A\) and \(B\), from a given game, \(i\), as

\[\begin{align} X_{Ai} &\sim Poisson(\lambda_{1i} + \lambda_{0i}), \notag \\ X_{Bi} &\sim Poisson(\lambda_{2i} + \lambda_{0i}), \notag \\ log(\lambda_{1i}) &= \mu + \eta H_i + \alpha_A + \delta_B, \tag{2.5} \\ log(\lambda_{2i}) &= \mu + \alpha_B + \delta_A, \tag{2.6} \\ log(\lambda_{0i}) &= \rho_A + \rho_B \tag{2.7} \end{align}\]

Here, \(\mu\) denotes the overall intercept, or the expected log goals for a team not playing at home, and \(\eta\) represents the increase in expected log goals for a team playing at home. \(H_i\) is a dummy variable indicating whether game \(i\) was played at the home team's stadium (1) or a neutral site (0). The estimates of team ability come from \(\alpha\) and \(\delta\), which represent the attacking and defensive abilities of a given team, respectively; these can be modeled as fixed or random effects. Finally, \(\rho\) denotes each team's contribution to the log of the expected covariance (Whitaker, 2011).
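To make the parameterization concrete, the following R sketch plugs hypothetical parameter values (not estimates from any fitted model) into Equations (2.5) through (2.7) to compute expected goals for a single game.

# Hypothetical parameter values, for illustration only
mu      <-  0.10   # overall intercept (log scale)
eta     <-  0.25   # home-field advantage (log scale)
alpha_a <-  0.30   # team A attacking ability
delta_a <- -0.20   # team A defensive ability
alpha_b <-  0.05   # team B attacking ability
delta_b <- -0.15   # team B defensive ability
rho_a   <- -1.50   # team A covariance intercept
rho_b   <- -1.20   # team B covariance intercept
h_i     <-  1      # game played at team A's stadium

lambda_1 <- exp(mu + eta * h_i + alpha_a + delta_b)
lambda_2 <- exp(mu + alpha_b + delta_a)
lambda_0 <- exp(rho_a + rho_b)

lambda_1 + lambda_0   # expected goals for team A (home)
lambda_2 + lambda_0   # expected goals for team B (away)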

2.1.1 Implementing the bivariate Poisson model

The bivariate Poisson model can be fit using the following Stan code and the rstan package (Guo, Gabry, & Goodrich, 2017). In the model code, I have modeled \(\alpha\), \(\delta\), and \(\rho\) as random effects so that \(\mu\) represents the overall mean. This means that positive \(\alpha\) values and negative \(\delta\) values are good: a team wants its attack to add goals above the average and its defense to hold the opponent below the average.

All parameters are given non-informative priors, specified as normal distributions with a mean of 0 and a standard deviation of 10. This specification provides a diffuse range of plausible values, allowing the likelihood to dominate, while avoiding the arbitrary boundaries that a uniform prior would require.
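As a quick check of how diffuse this prior is, the following R sketch (purely illustrative) pushes draws from a normal prior with mean 0 and standard deviation 10 through the log link used by the model; the implied scoring rates span many orders of magnitude.

set.seed(9416)

beta  <- rnorm(100000, mean = 0, sd = 10)   # draws from the normal(0, 10) prior
rates <- exp(beta)                          # implied rates on the goal scale
quantile(rates, probs = c(0.025, 0.5, 0.975))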

I have also reparameterized the random effects so that Stan samples from a \(\mathcal{N}(0,\ 1)\) distribution, which reduces computation time, improves the efficiency of the sampler, and helps avoid divergent transitions (Betancourt, 2016, 2017; Stan Development Team, 2016c).
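Concretely, rather than sampling, for example, \(\alpha \sim \mathcal{N}(0,\ \sigma_a)\) directly, the sampler draws a standardized effect and rescales it in the transformed parameters block,

\[\alpha^{raw} \sim \mathcal{N}(0,\ 1), \qquad \alpha = \sigma_a \alpha^{raw},\]

which implies the same distribution for \(\alpha\) while letting Stan explore the posterior on the unit scale; the same transformation is applied to \(\delta\) and \(\rho\).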

data {
  int<lower=1> num_clubs;                           // number of clubs
  int<lower=1> num_games;                           // number of games
  int<lower=1,upper=num_clubs> home[num_games];     // home club for game g
  int<lower=1,upper=num_clubs> away[num_games];     // away club for game g
  int<lower=0> h_goals[num_games];                  // home goals for game g
  int<lower=0> a_goals[num_games];                  // away goals for game g
  int<lower=0,upper=1> homeg[num_games];            // home field for game g
}
parameters {
  vector[num_clubs] raw_alpha;                  // attacking intercepts
  vector[num_clubs] raw_delta;                  // defending intercepts
  vector[num_clubs] raw_rho;                    // covariance intercepts

  real mu;                                          // fixed intercept
  real eta;                                         // homefield
  real<lower=0> sigma_a;                            // attacking sd
  real<lower=0> sigma_d;                            // defending sd
  real<lower=0> sigma_r;                            // covariance sd
}
transformed parameters {
  vector[num_clubs] alpha;
  vector[num_clubs] delta;
  vector[num_clubs] rho;
  
  // non-centered parameterization: rescale the standardized effects
  alpha = raw_alpha * sigma_a;
  delta = raw_delta * sigma_d;
  rho = raw_rho * sigma_r;
}
model {
  vector[num_games] lambda1;
  vector[num_games] lambda2;
  vector[num_games] lambda3;

  // priors
  raw_alpha ~ normal(0, 1);
  raw_delta ~ normal(0, 1);
  raw_rho ~ normal(0, 1);
  mu ~ normal(0, 10);
  eta ~ normal(0, 10);
  sigma_a ~ normal(0, 10);
  sigma_d ~ normal(0, 10);
  sigma_r ~ normal(0, 10);

  // likelihood
  for (g in 1:num_games) {
    lambda1[g] = exp(mu + (eta * homeg[g]) + alpha[home[g]] + delta[away[g]]);
    lambda2[g] = exp(mu + alpha[away[g]] + delta[home[g]]);
    lambda3[g] = exp(rho[home[g]] + rho[away[g]]);
  }
  h_goals ~ poisson(lambda1 + lambda3);
  a_goals ~ poisson(lambda2 + lambda3);
}
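With the model code in place, it can be compiled and sampled with rstan. In the sketch below, the file name biv_pois.stan and the data frame games are illustrative assumptions; any object supplying the elements of the data block will work.

library(rstan)

# Illustrative only: assumes the Stan code above is saved as "biv_pois.stan"
# and that `games` is a data frame with integer club indices (home, away),
# goal counts (h_goals, a_goals), and a home-field indicator (homeg)
stan_data <- list(
  num_clubs = max(c(games$home, games$away)),
  num_games = nrow(games),
  home      = games$home,
  away      = games$away,
  h_goals   = games$h_goals,
  a_goals   = games$a_goals,
  homeg     = games$homeg
)

biv_pois <- stan(file = "biv_pois.stan", data = stan_data,
                 chains = 4, iter = 2000, cores = 4)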

2.2 The game random intercept model

As an alternative to the bivariate Poisson model, one could include a random intercept for each game rather than estimating \(\rho\). The game random intercept model is then defined as

\[\begin{align} X_{Ai} &\sim Poisson(\lambda_{1i}), \notag \\ X_{Bi} &\sim Poisson(\lambda_{2i}), \notag \\ log(\lambda_{1i}) &= \mu + \eta H_i + \alpha_A + \delta_B + \gamma_i, \tag{2.8} \\ log(\lambda_{2i}) &= \mu + \alpha_B + \delta_A + \gamma_i \tag{2.9} \end{align}\]

This model is very similar to the bivariate Poisson. The two rate parameters, \(\lambda_{1i}\) and \(\lambda_{2i}\), are defined in the same way, with the addition of \(\gamma_i\) denoting the random intercept for game \(i\). This \(\gamma_i\) replaces \(\lambda_{0i}\) from the bivariate Poisson model, which has downstream effects on the estimation.

In the bivariate Poisson model, \(\rho\) is estimated for each team, so the covariance \(\lambda_{0i}\) for a given game is predicted from the competing teams' \(\rho\) values. This also allows predictions to be made about the covariance, or dependency, between the teams in future games. In contrast, the game random intercept model does not estimate predictors for this dependency. Instead, the dependency is treated as a random variable with a variance to be estimated.

Thus, although both models take into account the dependency between the two scores in a given game, the models make different assumptions about the nature of this dependency.
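To illustrate this distinction, the following R sketch uses hypothetical parameter values (not estimates from either model) to show how each model would characterize the dependency for a future game between teams \(A\) and \(B\).

# Hypothetical parameter values, for illustration only
rho_a   <- -1.4    # team A covariance intercept (bivariate Poisson)
rho_b   <- -1.1    # team B covariance intercept (bivariate Poisson)
sigma_g <-  0.2    # game intercept standard deviation (random intercept model)

# Bivariate Poisson: the dependency for a future A vs. B game is predicted
# directly from the teams' rho values
lambda_0_new <- exp(rho_a + rho_b)

# Game random intercept model: the dependency is not predicted from team-level
# parameters; a future game receives a fresh draw from the random effect
set.seed(9416)
gamma_new <- rnorm(1, mean = 0, sd = sigma_g)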

2.2.1 Implementing the game random intercept model

The game random intercept model can be estimated using the following Stan code and the rstan package (Guo et al., 2017). As with the bivariate Poisson model, I have modeled \(\alpha\) and \(\delta\) as random effects so that \(\mu\) represents the overall mean. Diffuse normal priors are specified in the same way as for the bivariate Poisson model. The random effects are also reparameterized in the same way as they were in the bivariate Poisson to reduce computation time and increase efficiency (Betancourt, 2016, 2017; Stan Development Team, 2016c).

data {
  int<lower=1> num_clubs;                           // number of clubs
  int<lower=1> num_games;                           // number of games
  int<lower=1,upper=num_clubs> home[num_games];     // home club for game g
  int<lower=1,upper=num_clubs> away[num_games];     // away club for game g
  int<lower=0> h_goals[num_games];                  // home goals for game g
  int<lower=0> a_goals[num_games];                  // away goals for game g
  int<lower=0,upper=1> homeg[num_games];            // home field for game g
}
parameters {
  vector[num_clubs] raw_alpha;                      // attacking intercepts
  vector[num_clubs] raw_delta;                      // defending intercepts
  vector[num_games] raw_gamma;                      // game intercepts

  real mu;                                          // fixed intercept
  real eta;                                         // homefield
  real<lower=0> sigma_a;                            // attacking sd
  real<lower=0> sigma_d;                            // defending sd
  real<lower=0> sigma_g;                            // game sd
}
transformed parameters {
  vector[num_clubs] alpha;
  vector[num_clubs] delta;
  vector[num_games] gamma;

  // non-centered parameterization: rescale the standardized effects
  alpha = sigma_a * raw_alpha;
  delta = sigma_d * raw_delta;
  gamma = sigma_g * raw_gamma;
}
model {
  vector[num_games] lambda1;
  vector[num_games] lambda2;

  // priors
  raw_alpha ~ normal(0, 1);                         // attacking random effects
  raw_delta ~ normal(0, 1);                         // defending random effects
  raw_gamma ~ normal(0, 1);                         // game random effects
  mu ~ normal(0, 10);
  eta ~ normal(0, 10);
  sigma_a ~ normal(0, 10);
  sigma_d ~ normal(0, 10);
  sigma_g ~ normal(0, 10);

  // likelihood
  for (g in 1:num_games) {
    lambda1[g] = exp(mu + (eta * homeg[g]) + alpha[home[g]] + delta[away[g]] + gamma[g]);
    lambda2[g] = exp(mu + alpha[away[g]] + delta[home[g]] + gamma[g]);
  }
  h_goals ~ poisson(lambda1);
  a_goals ~ poisson(lambda2);
}
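As before, the model can be compiled and sampled with rstan. This sketch reuses the stan_data list built for the bivariate Poisson model and assumes the Stan code above is saved as game_int.stan (an illustrative file name).

# Illustrative only: `stan_data` is the list built for the bivariate Poisson
# model, and "game_int.stan" contains the Stan code above
game_int <- stan(file = "game_int.stan", data = stan_data,
                 chains = 4, iter = 2000, cores = 4)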