
League of Legends: Match Outcome Predictor

Predicting the outcome of a League match using Machine Learning

Final Project for EECS 398 Practical Data Science

Names: Anthony Sun, Wenjia Lu

Introduction

In the high-stakes world of professional League of Legends, every second and every decision can tip the balance between victory and defeat. While matches officially conclude when a team’s Nexus is destroyed, competitive outcomes are often shaped much earlier through resource advantages and map control. Three in-game metrics (creep score (CS), gold, and map control) consistently stand out as key indicators of a team’s advantage over the course of the game.

This analysis focuses on determining whether it’s possible to predict the final outcome of a match using these performance metrics.

Our dataset of 115,152 rows comes from Oracle’s Elixir and contains individual player statistics and teamwide statistics for 9,596 professional games. We analyze only the teamwide statistics, which make up 19,192 rows of the dataset (two per game). The relevant columns are as follows:

result: outcome of a match; ‘0’ indicates defeat, ‘1’ indicates victory

gameid: unique identifier for each game

side: either ‘Red’ or ‘Blue’, indicating which side the team starts on in the game; we later encode this numerically as 0 (Blue) and 1 (Red)

firstherald: whether the team has secured the first herald; ‘1’ indicates true, ‘0’ indicates false

heralds: number of heralds the team has slain

firstdragon: whether the team has secured the first dragon; ‘1’ indicates true, ‘0’ indicates false

firstblood: whether the team was the first to slay an enemy player; ‘1’ indicates true, ‘0’ indicates false

goldatX: total gold earned for the team at the X-minute mark; indicative of strength in terms of economy, for X = 10, 15, 20

golddiffatX: difference in total gold earned between teams at the X-minute mark; indicative of strength in terms of economy, for X = 10, 15, 20

xpatX: total experience points earned for the team at the X-minute mark; indicative of strength in terms of levels, for X = 10, 15, 20

xpdiffatX: difference in experience points earned between teams at the X-minute mark; indicative of strength in terms of levels, for X = 10, 15, 20

csatX: total creep score (CS) for the team at the X-minute mark; contributes to overall gold and XP advantage, for X = 10, 15, 20

csdiffatX: difference in creep score (CS) between teams at the X-minute mark; contributes to overall gold and XP advantage, for X = 10, 15, 20

killsatX: number of kills by the team at the X-minute mark, for X = 10, 15, 20

assistsatX: number of assists by the team at the X-minute mark, for X = 10, 15, 20

deathsatX: number of deaths by the team at the X-minute mark, for X = 10, 15, 20
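
As a rough illustration of how the team rows and the columns listed above might be selected with pandas, here is a minimal sketch. It runs on a tiny made-up frame rather than the real export, and it assumes that team-level rows are marked with position equal to "team"; that marker and the short feature list are assumptions for illustration only.

```python
import pandas as pd

# Tiny stand-in for the Oracle's Elixir export; all values are made up.
raw = pd.DataFrame({
    "gameid": ["G1"] * 4,
    "position": ["top", "team", "jng", "team"],
    "side": ["Blue", "Blue", "Red", "Red"],
    "result": [1, 1, 0, 0],
    "goldat10": [3200, 16100, 3000, 15200],
    "golddiffat10": [200, 900, -200, -900],
})

# Assumption: team-level rows carry position == "team".
teams = raw[raw["position"] == "team"].copy()

# Encode side numerically, as described in the column list: Blue -> 0, Red -> 1.
teams["side"] = teams["side"].map({"Blue": 0, "Red": 1})

# Keep the identifier, the target, and a (shortened) list of feature columns.
feature_cols = ["side", "goldat10", "golddiffat10"]
teams = teams[["gameid", "result"] + feature_cols].reset_index(drop=True)
print(teams)
```

With the full dataset, feature_cols would be extended to every column named in the list above.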

Here’s a preview of our data:

gameid datacompleteness url league year split playoffs date game patch participantid side position playername playerid teamname teamid champion ban1 ban2 ban3 ban4 ban5 pick1 pick2 pick3 pick4 pick5 gamelength result kills deaths assists teamkills teamdeaths doublekills triplekills quadrakills pentakills firstblood firstbloodkill firstbloodassist firstbloodvictim team kpm ckpm firstdragon dragons opp_dragons elementaldrakes opp_elementaldrakes infernals mountains clouds oceans chemtechs hextechs dragons (type unknown) elders opp_elders firstherald heralds opp_heralds void_grubs opp_void_grubs firstbaron barons opp_barons firsttower towers opp_towers firstmidtower firsttothreetowers turretplates opp_turretplates inhibitors opp_inhibitors damagetochampions dpm damageshare damagetakenperminute damagemitigatedperminute wardsplaced wpm wardskilled wcpm controlwardsbought visionscore vspm totalgold earnedgold earned gpm earnedgoldshare goldspent gspd gpr total cs minionkills monsterkills monsterkillsownjungle monsterkillsenemyjungle cspm goldat10 xpat10 csat10 opp_goldat10 opp_xpat10 opp_csat10 golddiffat10 xpdiffat10 csdiffat10 killsat10 assistsat10 deathsat10 opp_killsat10 opp_assistsat10 opp_deathsat10 goldat15 xpat15 csat15 opp_goldat15 opp_xpat15 opp_csat15 golddiffat15 xpdiffat15 csdiffat15 killsat15 assistsat15 deathsat15 opp_killsat15 opp_assistsat15 opp_deathsat15 goldat20 xpat20 csat20 opp_goldat20 opp_xpat20 opp_csat20 golddiffat20 xpdiffat20 csdiffat20 killsat20 assistsat20 deathsat20 opp_killsat20 opp_assistsat20 opp_deathsat20 goldat25 xpat25 csat25 opp_goldat25 opp_xpat25 opp_csat25 golddiffat25 xpdiffat25 csdiffat25 killsat25 assistsat25 deathsat25 opp_killsat25 opp_assistsat25 opp_deathsat25
ESPORTSTMNT01_2690210 complete nan LCKC 2022 Spring 0 2022-01-10 07:44:08 1 12.01 1 Blue top Soboro oe:player:38e0af7278d6769d0c81d7c4b47ac1e BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Renekton Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 3 2 9 19 0 0 0 0 0 0 0 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 15768 552.294 0.278784 1072.4 777.793 8 0.2802 6 0.2102 5 26 0.9107 10934 7164 250.928 0.253859 10275 nan nan 231 220 11 nan nan 8.0911 3228 4909 89 3176 4953 81 52 -44 8 0 0 0 0 0 0 5025 7560 135 4634 7215 121 391 345 14 0 1 0 0 1 0 6506 9853 172 6338 10200 171 168 -347 1 0 1 1 0 2 0 8462 11754 212 7857 12279 203 605 -525 9 0 1 1 0 2 0
ESPORTSTMNT01_2690210 complete nan LCKC 2022 Spring 0 2022-01-10 07:44:08 1 12.01 2 Blue jng Raptor oe:player:637ed20b1e41be1c51bd1a4cb211357 BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Xin Zhao Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 5 6 9 19 0 0 0 0 1 0 1 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 1 11765 412.084 0.208009 944.273 650.158 6 0.2102 18 0.6305 6 48 1.6813 9138 5368 188.021 0.19022 8750 nan nan 148 33 115 nan nan 5.1839 3429 3484 58 2944 3052 63 485 432 -5 1 2 0 0 0 1 5366 5320 89 4825 5595 100 541 -275 -11 2 3 2 0 5 1 6854 7193 116 6708 8275 142 146 -1082 -26 2 3 2 1 5 1 8254 8958 135 7833 9861 163 421 -903 -28 2 4 2 1 5 1
ESPORTSTMNT01_2690210 complete nan LCKC 2022 Spring 0 2022-01-10 07:44:08 1 12.01 3 Blue mid Feisty oe:player:d1ae0e2f9f3ac1e0e0cdcb86504ca77 BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca LeBlanc Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 2 3 9 19 0 0 0 0 0 0 0 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 14258 499.405 0.252086 581.646 227.776 19 0.6655 7 0.2452 7 29 1.0158 9715 5945 208.231 0.210665 8725 nan nan 193 177 16 nan nan 6.7601 3283 4556 81 3121 4485 81 162 71 0 0 1 0 0 0 1 5118 6942 120 5593 6789 119 -475 153 1 0 3 0 3 3 2 6511 8786 157 6973 9056 154 -462 -270 3 0 3 0 3 4 2 8312 10537 182 8461 10761 187 -149 -224 -5 1 3 0 3 4 3
ESPORTSTMNT01_2690210 complete nan LCKC 2022 Spring 0 2022-01-10 07:44:08 1 12.01 4 Blue bot Gamin oe:player:998b3e49b01ecc41eacc392477a98cf BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Samira Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 4 2 9 19 0 0 0 0 1 0 1 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 11106 389.002 0.196358 463.853 218.879 12 0.4203 6 0.2102 4 25 0.8757 10605 6835 239.405 0.242201 10425 nan nan 226 208 18 nan nan 7.9159 3600 3103 78 3304 2838 90 296 265 -12 1 1 0 0 0 0 5461 4591 115 6254 5934 149 -793 -1343 -34 2 1 2 3 3 0 7306 6266 153 8516 8611 223 -1210 -2345 -70 2 1 2 3 4 0 9356 8287 199 10644 10292 284 -1288 -2005 -85 2 1 2 3 4 0
ESPORTSTMNT01_2690210 complete nan LCKC 2022 Spring 0 2022-01-10 07:44:08 1 12.01 5 Blue sup Loopy oe:player:e9741b3a238723ea6380ef2113fae63 BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Leona Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 1 5 6 9 19 0 0 0 0 1 1 0 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 3663 128.301 0.0647631 475.026 490.123 29 1.0158 14 0.4904 11 69 2.4168 6678 2908 101.856 0.103054 6395 nan nan 42 42 0 nan nan 1.4711 2678 2161 16 2150 2748 15 528 -587 1 1 1 0 0 0 1 3836 3588 28 3393 4085 21 443 -497 7 1 2 2 0 6 2 4785 4776 33 4371 5679 25 414 -903 8 1 2 2 0 7 2 5840 6424 39 5341 6738 27 499 -314 12 1 3 2 0 7 2

Data Cleaning and Exploratory Data Analysis

Data Cleaning

We removed the rows containing individual player statistics, since our model uses only teamwide data, and then handled missing values as described below.

Imputation

We left missing values as NaN instead of imputing them with the mean or median. In this dataset, values are missing either (1) because the record is incomplete and 20-minute data was unavailable, or (2) because the game ended before the 20-minute mark. In both situations, filling in a value would change the meaning of the missing entry, so we left these values as is.
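
The two causes of missing 20-minute values described above can be told apart from the game length. The sketch below uses made-up rows and assumes gamelength is measured in seconds, which is consistent with the preview (1713 corresponds to a game of roughly 28.5 minutes). The missing_reason column is a hypothetical name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Made-up team rows; gamelength is assumed to be in seconds.
teams = pd.DataFrame({
    "gameid": ["G1", "G2", "G3"],
    "gamelength": [1713, 1100, 1900],
    "golddiffat20": [168.0, np.nan, np.nan],
})

missing_20 = teams["golddiffat20"].isna()
ended_early = teams["gamelength"] < 20 * 60  # game over before 20:00

# Label each row instead of filling the NaN with a mean or median.
teams["missing_reason"] = "present"
teams.loc[missing_20 & ended_early, "missing_reason"] = "ended_before_20"
teams.loc[missing_20 & ~ended_early, "missing_reason"] = "not_recorded"
print(teams)
```

Labeling the rows this way keeps the NaN values intact while making it explicit why each one is missing.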

We are left with the following data (preview) after cleaning:

gameid side position playerid teamname teamid champion ban1 ban2 ban3 ban4 ban5 pick1 pick2 pick3 pick4 pick5 gamelength result kills deaths assists doublekills triplekills quadrakills pentakills firstblood firstbloodkill firstbloodassist firstbloodvictim team kpm ckpm firstdragon dragons opp_dragons elementaldrakes opp_elementaldrakes infernals mountains clouds oceans chemtechs hextechs dragons (type unknown) elders opp_elders firstherald heralds opp_heralds void_grubs opp_void_grubs firstbaron barons opp_barons firsttower towers opp_towers firstmidtower firsttothreetowers turretplates opp_turretplates inhibitors opp_inhibitors damagetochampions dpm damageshare damagetakenperminute damagemitigatedperminute wardsplaced wpm wardskilled wcpm controlwardsbought visionscore vspm totalgold earnedgold earned gpm earnedgoldshare goldspent gspd gpr total cs minionkills monsterkills monsterkillsownjungle monsterkillsenemyjungle cspm goldat10 xpat10 csat10 opp_goldat10 opp_xpat10 opp_csat10 golddiffat10 xpdiffat10 csdiffat10 killsat10 assistsat10 deathsat10 opp_killsat10 opp_assistsat10 opp_deathsat10 goldat15 xpat15 csat15 opp_goldat15 opp_xpat15 opp_csat15 golddiffat15 xpdiffat15 csdiffat15 killsat15 assistsat15 deathsat15 opp_killsat15 opp_assistsat15 opp_deathsat15 goldat20 xpat20 csat20 opp_goldat20 opp_xpat20 opp_csat20 golddiffat20 xpdiffat20 csdiffat20 killsat20 assistsat20 deathsat20 opp_killsat20 opp_assistsat20 opp_deathsat20 goldat25 xpat25 csat25 opp_goldat25 opp_xpat25 opp_csat25 golddiffat25 xpdiffat25 csdiffat25 killsat25 assistsat25 deathsat25 opp_killsat25 opp_assistsat25 opp_deathsat25
ESPORTSTMNT01_2690210 Blue top oe:player:38e0af7278d6769d0c81d7c4b47ac1e BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Renekton Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 3 2 0 0 0 0 0 0 0 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 15768 552.294 0.278784 1072.4 777.793 8 0.2802 6 0.2102 5 26 0.9107 10934 7164 250.928 0.253859 10275 nan nan 231 220 11 nan nan 8.0911 3228 4909 89 3176 4953 81 52 -44 8 0 0 0 0 0 0 5025 7560 135 4634 7215 121 391 345 14 0 1 0 0 1 0 6506 9853 172 6338 10200 171 168 -347 1 0 1 1 0 2 0 8462 11754 212 7857 12279 203 605 -525 9 0 1 1 0 2 0
ESPORTSTMNT01_2690210 Blue jng oe:player:637ed20b1e41be1c51bd1a4cb211357 BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Xin Zhao Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 5 6 0 0 0 0 1 0 1 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 1 11765 412.084 0.208009 944.273 650.158 6 0.2102 18 0.6305 6 48 1.6813 9138 5368 188.021 0.19022 8750 nan nan 148 33 115 nan nan 5.1839 3429 3484 58 2944 3052 63 485 432 -5 1 2 0 0 0 1 5366 5320 89 4825 5595 100 541 -275 -11 2 3 2 0 5 1 6854 7193 116 6708 8275 142 146 -1082 -26 2 3 2 1 5 1 8254 8958 135 7833 9861 163 421 -903 -28 2 4 2 1 5 1
ESPORTSTMNT01_2690210 Blue mid oe:player:d1ae0e2f9f3ac1e0e0cdcb86504ca77 BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca LeBlanc Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 2 3 0 0 0 0 0 0 0 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 14258 499.405 0.252086 581.646 227.776 19 0.6655 7 0.2452 7 29 1.0158 9715 5945 208.231 0.210665 8725 nan nan 193 177 16 nan nan 6.7601 3283 4556 81 3121 4485 81 162 71 0 0 1 0 0 0 1 5118 6942 120 5593 6789 119 -475 153 1 0 3 0 3 3 2 6511 8786 157 6973 9056 154 -462 -270 3 0 3 0 3 4 2 8312 10537 182 8461 10761 187 -149 -224 -5 1 3 0 3 4 3
ESPORTSTMNT01_2690210 Blue bot oe:player:998b3e49b01ecc41eacc392477a98cf BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Samira Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 2 4 2 0 0 0 0 1 0 1 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 11106 389.002 0.196358 463.853 218.879 12 0.4203 6 0.2102 4 25 0.8757 10605 6835 239.405 0.242201 10425 nan nan 226 208 18 nan nan 7.9159 3600 3103 78 3304 2838 90 296 265 -12 1 1 0 0 0 0 5461 4591 115 6254 5934 149 -793 -1343 -34 2 1 2 3 3 0 7306 6266 153 8516 8611 223 -1210 -2345 -70 2 1 2 3 4 0 9356 8287 199 10644 10292 284 -1288 -2005 -85 2 1 2 3 4 0
ESPORTSTMNT01_2690210 Blue sup oe:player:e9741b3a238723ea6380ef2113fae63 BRION Challengers oe:team:733ebb9dbf22a401c0127a0c80193ca Leona Karma Caitlyn Syndra Thresh Lulu nan nan nan nan nan 1713 0 1 5 6 0 0 0 0 1 1 0 0 0.3152 0.9807 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan 0 0 nan nan nan nan nan nan nan 0 0 3663 128.301 0.0647631 475.026 490.123 29 1.0158 14 0.4904 11 69 2.4168 6678 2908 101.856 0.103054 6395 nan nan 42 42 0 nan nan 1.4711 2678 2161 16 2150 2748 15 528 -587 1 1 1 0 0 0 1 3836 3588 28 3393 4085 21 443 -497 7 1 2 2 0 6 2 4785 4776 33 4371 5679 25 414 -903 8 1 2 2 0 7 2 5840 6424 39 5341 6738 27 499 -314 12 1 3 2 0 7 2

Univariate Analysis

We first analyzed the distribution of gold earned per minute for the two teams.

Gold reflects how much of an advantage one team has over the other in terms of equipment. We plotted a histogram of gold per minute for both teams in every game. The result is a bimodal distribution, which suggests that a team’s gold per minute can help tell whether it won or lost. This makes sense, since the winning team typically earns more gold than the losing team. Using the rate of gold earned, rather than the total, accounts for differences in game length.
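
To make the normalization by game length concrete, the short calculation below reproduces the earned gpm value from the first row of the raw preview, assuming gamelength is in seconds. The variable names are ours.

```python
# Values taken from the first row of the raw data preview.
earned_gold = 7164          # earnedgold column
game_length_seconds = 1713  # gamelength column, assumed to be in seconds

minutes = game_length_seconds / 60
earned_gpm = earned_gold / minutes
print(round(earned_gpm, 3))  # 250.928, matching the earned gpm column
```

Dividing by minutes played puts a 25-minute game and a 45-minute game on the same scale, which is what allows the histogram to compare teams across games of different lengths.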

Having quantified the resource advantage, we then analyzed the map control advantage using team vision scores.

Bivariate Analysis

Vision score reflects how well a team can see its opponents’ movements through the fog of war; a higher score means better vision. We drew box plots of vision score split by whether the team won or lost. The two distributions are similar, with winning teams scoring slightly higher, so vision score offers only a modest hint about the outcome. The similarity makes sense, since professional teams all understand the importance of vision. The small edge for winning teams can be attributed to the greater map control that a lead allows, which lets them place more wards deeper on the map.

Interesting Aggregates

We grouped the team data by whether teams won and lost:

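| result | kills | deaths | team kpm | elementaldrakes | elders | firstherald | heralds | firstbaron | barons | firsttower | damagetochampions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 (loss) | 9.37182 | 19.6339 | 0.291164 | 1.42961 | 0.0219595 | 0.417042 | 0.76708 | 0.14283 | 0.22201 | 0.316535 | 59724.8 |
| 1 (win) | 19.6138 | 9.40944 | 0.641958 | 2.95844 | 0.0853041 | 0.582676 | 1.20646 | 0.804711 | 1.12447 | 0.683465 | 73721 |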

We can see a clear pattern: winning teams have higher means in every selected performance metric except deaths, where they are lower, as expected. This supports the idea that a team’s performance metrics can predict whether it won or lost a game with decent accuracy.
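
A grouped table of this kind can be produced with a pandas groupby on result. The sketch below uses four made-up team rows and only three of the columns, so its numbers are illustrative and do not come from our dataset.

```python
import pandas as pd

# Four made-up team rows (two games), not values from the real dataset.
teams = pd.DataFrame({
    "result": [1, 0, 1, 0],
    "kills": [20, 8, 18, 11],
    "deaths": [8, 20, 11, 18],
    "firstbaron": [1, 0, 1, 0],
})

# Mean of each metric for losing (0) and winning (1) teams.
means = teams.groupby("result")[["kills", "deaths", "firstbaron"]].mean()
print(means)
```

For binary columns such as firstbaron, the group mean is the share of teams in that group that secured the objective.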

Framing a Prediction Problem

Through our data analysis, we found a clear pattern that the winning team has higher values in the performance metrics we selected. Hence, we ask the question: What will be the result of the game based on team-aggregate game state at 20 minutes? The 20-minute threshold is a pivotal moment in professional play, often marking the transition from early-game skirmishes to larger, decisive teamfights and map objectives. By investigating the relationship between early-game advantages and eventual match results, we want to see how reliably teams can convert leads into wins at the highest level of play.

Our classifier performs binary classification, with the response variable being result, i.e. whether the team won the match. We chose accuracy as the evaluation metric because wins and losses are split 50-50, so accuracy gives a clear measure of the model’s performance that is not distorted by class imbalance.
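
As a small illustration of why accuracy is easy to interpret on balanced classes, the sketch below computes accuracy for a handful of made-up predictions and compares it with a trivial rule that always predicts a win. The labels and predictions are invented for the example.

```python
import numpy as np

# Made-up labels (balanced 50-50) and predictions.
y_true = np.array([1, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1])

# Accuracy is the fraction of predictions that match the true result.
accuracy = float(np.mean(y_true == y_pred))

# A rule that always predicts "win" is right only half the time here.
always_win_accuracy = float(np.mean(y_true == 1))

print(accuracy, always_win_accuracy)
```

Because the trivial rule scores 0.5 on balanced data, any accuracy clearly above 0.5 reflects real signal in the features.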

Baseline Model

Our baseline model is a logistic regression trained on the features outlined in the introduction. side is a nominal feature, which we one-hot encoded to convert it to numerical data. The remaining features, such as the gold totals and gold differences, are quantitative. The dataset was split 80:20 for training and testing.

We tuned the model’s hyperparameters using grid search. After fitting, the baseline model achieved an accuracy of 0.7780. This is a solid baseline, well above the 0.5 expected from guessing on a 50-50 outcome, for predicting game outcomes from the first 20 minutes of gameplay.
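
The baseline setup described above (one-hot encoding of side, an 80:20 split, logistic regression, and grid search) could be sketched with scikit-learn roughly as follows. This is not our exact code: the data is synthetic, only two numeric features are used, the grid of C values is hypothetical, and the StandardScaler step is an addition for solver stability that the write-up does not describe.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data: win probability rises with golddiffat20.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "side": rng.choice(["Blue", "Red"], size=n),
    "golddiffat20": rng.normal(0, 3000, size=n),
    "xpdiffat20": rng.normal(0, 2000, size=n),
})
p_win = 1 / (1 + np.exp(-X["golddiffat20"] / 2000))
y = (rng.random(n) < p_win).astype(int)

# 80:20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

preprocess = ColumnTransformer([
    ("side", OneHotEncoder(drop="first"), ["side"]),
    # Scaling is an assumption added here, not part of the described method.
    ("num", StandardScaler(), ["golddiffat20", "xpdiffat20"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid; the values actually searched are not listed in the text.
grid = GridSearchCV(model, {"clf__C": [0.1, 1, 10]}, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)

test_accuracy = grid.score(X_test, y_test)
print(grid.best_params_, round(test_accuracy, 3))
```

Keeping the encoder inside the pipeline means each cross-validation fold fits its preprocessing on its own training portion only.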

In the next section, we will refine this model through feature engineering.

Final Model

We add kill_diff_10, kill_diff_15, and kill_diff_20 to our Logistic Regression baseline model, since a team’s kill count alone is not enough to describe its lead. For example, a 10-10 kill score is very different from a 10-0 kill score, even though the first team has the same number of kills and gold in both cases.
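
One plausible way to build these features is to subtract the opponent’s kills from the team’s kills at each time mark. The sketch below assumes that definition (killsatX minus opp_killsatX, both of which appear in the data preview) and uses two made-up rows that mirror the 10-10 and 10-0 example.

```python
import pandas as pd

# Two made-up team rows: a 10-10 game and a 10-0 game at each time mark.
teams = pd.DataFrame({
    "killsat10": [10, 10], "opp_killsat10": [10, 0],
    "killsat15": [12, 14], "opp_killsat15": [13, 1],
    "killsat20": [15, 18], "opp_killsat20": [15, 2],
})

# Assumed definition: kill difference = own kills - opponent kills.
for minute in (10, 15, 20):
    teams[f"kill_diff_{minute}"] = (
        teams[f"killsat{minute}"] - teams[f"opp_killsat{minute}"]
    )

print(teams[["kill_diff_10", "kill_diff_15", "kill_diff_20"]])
```

Both rows have 10 kills at 10 minutes, but kill_diff_10 separates the even game (0) from the dominant one (10).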

With these added features, the final model reached a testing accuracy of 0.7782, a marginal improvement of 0.0002 over the baseline’s 0.7780. Grid search selected the optimal parameters C=10 and penalty=l2, with a best cross-validation accuracy of 0.7986.