From Idea to Insight: Using Bayesian Hierarchical Models to Predict Game Outcomes Part 1

Imagine you’re a data scientist at an online mobile multiplayer competition platform. Your bosses have a vested interest in paying people with your skillset to predict game outcomes for a variety of commercial applications. For example, predictions help set odds, and sharing better insights with game developers on who wins and why could improve user engagement through better games and better-matched players.

Your boss tells you to write a function that returns the probability that player 1 will win game g against player 2, given the historical outcomes of prior matchups. The function takes three arguments: (i) and (ii) are the skill levels of players 1 and 2, respectively, and (iii) is the outcome history between them. It must run even when neither player appears in the given match history.
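To make the requirement concrete, here is a minimal sketch of what such an interface could look like. It is a placeholder, not the hierarchical model this series builds: the function name, the logical-vector form of the history, the logistic link on the raw skill difference, and the `prior_weight` pseudo-count are all assumptions made for illustration.

```r
# Hypothetical interface sketch; not the hierarchical model built in this series.
# Assumptions: `history` is a logical vector (TRUE = player 1 won a past game
# against player 2), and the skill difference maps to a win probability through
# a logistic curve. `prior_weight` is an illustrative pseudo-count.
predict_p1_win <- function(p1_skill, p2_skill, history = logical(0),
                           prior_weight = 2) {
  # Skill-only guess, used on its own when there is no shared history.
  prior <- plogis(p1_skill - p2_skill)
  wins <- sum(history)
  games <- length(history)
  # Shrink the observed head-to-head win rate toward the skill-only guess.
  (wins + prior_weight * prior) / (games + prior_weight)
}

predict_p1_win(5, 5)                           # no history: falls back to the prior
predict_p1_win(5, 5, history = c(TRUE, TRUE))  # history pulls the estimate upward
```

The point of the sketch is the last requirement: with an empty history the function still returns a probability, because the skill-only guess stands in for the missing data.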

In this tutorial, I run through the entire workflow of using Bayesian hierarchical models to predict game outcomes, from EDA to the model build and diagnosis.

First up, data.  

Data for 

  • Health Check
    • Missingness
    • Suspect Data/Outliers
    • Any winner IDs not in player col?
  • EDA
    • Spread Distribution
    • Class imbalance?

With NAs in the features player1_skill, player2_skill, and winner_id, I can either remove the affected rows or impute the values. I’ll impute the skill levels to preserve more information for the model. I’ve written a custom imputation framework that will:

  1. Impute NAs in existing rows and/or
  2. Add rows that represent observations that should be in your data but aren’t. For example, suppose you know that specific players ought to be in your data but, for some erroneous reason, they’re not. Given the list of players that should be present, this custom suite of functions returns imputed values for all target features for the missing player IDs.

It can be run automatically for any target feature. It’s fast, flexible, and automated, enabling use in any ETL pipeline when creating other features to ensure consistency.

library(dplyr)  # %>%, select(), mutate(), union_all()
library(purrr)  # map_chr(), map2_lgl(), map2_dfc()
# zoo is called as zoo::na.aggregate(). The helpers mk_imputation_df(),
# locate_missing_ids(), mk_padding(), and ul() are not shown here.

impute_missing <- function(have_these_data, need_these_ids,
                           locate_missing_in, vars_to_impute,
                           conserve_vars = NULL){
  # Replace NAs in have_these_data with the mean. Then add a new observation
  #   for every id in need_these_ids that's not in have_these_data, with each
  #   variable in vars_to_impute set to the mean (if it's not in conserve_vars)
  #   or the min (if it is).
  #   Output has nrow(have_these_data) + (number of missing ids) rows.

  # Make imputation dfs
  imp_table_list <- mk_imputation_df(vars_to_impute = vars_to_impute,
                                     from_df = have_these_data,
                                     conserve_vars = conserve_vars)

  # Find the ids that should be present but are not
  have_these_ids <- have_these_data[, locate_missing_in] %>% unique()
  missing_sum <- locate_missing_ids(need_these_ids, have_these_ids)
  missing_ids <- missing_sum$missing_ids$all_missing
  n_missing <- ul(missing_ids)

  # Build one padding row per missing id
  map2_dfc(.x = imp_table_list,
           .y = c(TRUE, FALSE),
           .f = function(df, num, n_missing) {
               mk_padding(df, numerics = num, n_missing = n_missing)
           },
           n_missing) %>%
    select(-imputed1) %>%
    mutate(!! quo_name(locate_missing_in) := missing_ids) %>%
    select(!! quo_name(locate_missing_in), imputed, everything()) %>%
    mutate_if(is.factor, .funs = as.character) %>%
    ungroup() %>%
    as_tibble() -> padding

  # Ensure data types are consistent between both dfs
  have_these_data <- ungroup(have_these_data)
  same_cols <- intersect(names(have_these_data), names(padding))
  select(padding, one_of(same_cols)) -> padding
  select(have_these_data, one_of(same_cols)) -> have_these_data

  # Columns whose types differ between the two dfs
  diff <- which(!map2_lgl(map_chr(.x = have_these_data, .f = typeof),
                          map_chr(.x = padding, .f = typeof), identical))

  # Get pairwise types
  x <- have_these_data[, diff]
  y <- padding[, diff]
  x_t <- map_chr(x, typeof)
  y_t <- map_chr(y, typeof)
  x <- map2_dfc(x,
                # ...
                .f = function(v, t){
                  if(t == "integer"){
                    # ...
                  }
                  if(t == "double"){
                    # ...
                  }
                  # ...
                })

  # Convert padding to match type in have_these_data
  have_these_data[, diff] <- x
  padding[, diff] <- y

  # Stack the original rows (imputed = 0) on top of the padding rows
  have_these_data %>%
    ungroup() %>%
    mutate(imputed = 0) %>%
    mutate_if(is.factor, .funs = as.character) %>%
    select(imputed, names(padding)) %>%
    union_all(padding) -> padded

  # Now replace NAs in every var in padded data
  out <- zoo::na.aggregate(select(padded, -id)) %>%
    cbind(select(padded, id)) %>%
    select(locate_missing_in, everything())

  out
}
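Because the helper functions called above are not shown, the following is a stripped-down, base R sketch of the same two steps (mean-impute NAs in existing rows, then add rows for IDs that should be present). It is an illustration under simplifying assumptions, not the author's framework: `impute_sketch`, its arguments, and the toy `players` data are hypothetical, and it always uses the mean rather than distinguishing conserved variables.

```r
# Simplified, hypothetical stand-in for the two steps; not the author's framework.
impute_sketch <- function(df, id_col, need_ids, vars) {
  df <- df[, c(id_col, vars), drop = FALSE]
  df$imputed <- 0

  # Step 2: add one all-NA row per id that should be present but is not.
  missing_ids <- setdiff(need_ids, df[[id_col]])
  if (length(missing_ids) > 0) {
    padding <- data.frame(missing_ids, stringsAsFactors = FALSE)
    names(padding) <- id_col
    for (v in vars) padding[[v]] <- NA_real_
    padding$imputed <- 1
    df <- rbind(df, padding)
  }

  # Step 1: replace every NA in the target variables with the column mean.
  for (v in vars) {
    col_mean <- mean(df[[v]], na.rm = TRUE)
    df[[v]][is.na(df[[v]])] <- col_mean
  }
  df
}

# Toy data: player 2 has no skill level, and player 4 is absent entirely.
players <- data.frame(id = c(1, 2, 3), skill = c(10, NA, 20))
result <- impute_sketch(players, id_col = "id",
                        need_ids = c(1, 2, 3, 4), vars = "skill")
print(result)
```

The `imputed` flag mirrors the column of the same name in the function above, so downstream steps can tell padded rows from observed ones.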



I remove the two matches without a winner ID because imputing an ID would produce an ID that isn’t either of the two player IDs.
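A minimal sketch of that step, assuming a match table with one row per game. Only player1_skill, player2_skill, and winner_id are column names from this article; the data frame name `matches`, the ID columns, and the toy values are assumptions.

```r
# Toy match table; `matches`, player1_id, and player2_id are hypothetical names.
matches <- data.frame(
  player1_id    = c("a", "b", "c"),
  player2_id    = c("b", "c", "a"),
  player1_skill = c(10, NA, 8),
  player2_skill = c(12, 9, 10),
  winner_id     = c("a", NA, "c"),
  stringsAsFactors = FALSE
)

# Missingness per column, to see which features need attention.
na_counts <- colSums(is.na(matches))
print(na_counts)

# Drop matches with no recorded winner rather than imputing an ID.
matches_clean <- matches[!is.na(matches$winner_id), ]
print(nrow(matches_clean))
```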

A cursory glance reveals nothing suspiciously large, small, or otherwise off about the numeric values.

Are there any winner IDs not in either player col?

There are 6 players who played each other but none of whom appears in the winner ID column.
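Two related checks can be sketched as follows: whether each recorded winner is one of the two players in that match, and which players never show up as a winner. The table name `matches`, the ID column names, and the toy values are assumptions for illustration.

```r
# Toy match table; player1_id and player2_id are hypothetical column names.
matches <- data.frame(
  player1_id = c("a", "b", "c", "e"),
  player2_id = c("b", "c", "a", "f"),
  winner_id  = c("a", "c", "a", "z"),
  stringsAsFactors = FALSE
)

# Matches whose recorded winner is neither of the two participants.
bad_winner <- !(matches$winner_id == matches$player1_id |
                matches$winner_id == matches$player2_id)
print(matches[bad_winner, ])

# Players who took part in at least one match but never appear as a winner.
all_players <- union(matches$player1_id, matches$player2_id)
never_won <- setdiff(all_players, matches$winner_id)
print(never_won)
```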

Data Bill of Health—Summary

  • 38 NA values of skill level across 78 unique players were imputed (replaced with feature mean).
  • 2 observations were removed because they didn’t have winner IDs.
  • No obvious outliers
  • No suspicious observations


In this exploratory section, I investigate how skill level is distributed. I also answer:

  • Do players play more than once?
  • Do players play each other more than once?
  • Do skill levels evolve in my data?

Players can play each other more than once, and a minority of players see their skill levels improve over time.
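The three questions above can be checked with a few lines of base R. This is a sketch on toy data: the table name `matches` and the ID columns are assumptions, and the last check only detects whether a player's recorded skill level varies, since ordering it over time would need a date or sequence column that is not assumed here.

```r
# Toy match table; `matches`, player1_id, and player2_id are hypothetical names.
matches <- data.frame(
  player1_id    = c("a", "b", "a", "c"),
  player2_id    = c("b", "a", "c", "d"),
  player1_skill = c(10, 12, 11, 9),
  player2_skill = c(12, 10, 9, 7),
  stringsAsFactors = FALSE
)

# One row per player appearance.
long <- data.frame(
  player_id = c(matches$player1_id, matches$player2_id),
  skill     = c(matches$player1_skill, matches$player2_skill),
  stringsAsFactors = FALSE
)

# Do players play more than once?
games_per_player <- table(long$player_id)
repeat_players <- names(games_per_player)[games_per_player > 1]

# Do players play each other more than once? Use an order-free pair key.
lo <- ifelse(matches$player1_id < matches$player2_id,
             matches$player1_id, matches$player2_id)
hi <- ifelse(matches$player1_id < matches$player2_id,
             matches$player2_id, matches$player1_id)
games_per_pair <- table(paste(lo, hi, sep = "|"))
repeat_pairs <- names(games_per_pair)[games_per_pair > 1]

# Do skill levels change? Count distinct recorded skill values per player.
n_skill <- sapply(split(long$skill, long$player_id),
                  function(s) length(unique(s)))
skill_changed <- names(n_skill)[n_skill > 1]

print(repeat_players)
print(repeat_pairs)
print(skill_changed)
```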


With the grunt EDA work out of the way, let’s move on to exercising the predictive modeling skills that put bread on the table. In the next article, I’ll select an appropriate model by comparing a naive approach to a more robust Bayesian hierarchical model, perform ROC analysis, and bundle all that into a function that predicts game outcomes given the player history.

Brandon Dey, ODSC

Brandon is a Consulting Data Scientist at Avanade, the joint venture between Microsoft and Accenture, in Portland, Oregon. Outside of work, he wonders if he's actually fooling anyone by referring to himself in the third person. Plus, he likes backpacking, long distance trail running, aerial photography, writing creative non-fiction, and attempting to write short stories with characters suspiciously similar to himself...