Model accuracy after predicting 150 fights

Summary / TL;DR

Longtime patrons already know what this blog is about. I've built a model to predict the outcomes of UFC fights. I started doing this for entertainment and out of my passion for MMA, but it also turns out that the model beats the oddsmakers and returns a profit of more than 30 percent on wagers (*). The model comes with three viable betting strategies.

Using the Standard method will generate a ~9 percent return on wagers. The Standard method puts wagers on all predicted outcomes.

The Cherry Picking method selects only bets where the predictions are especially one-sided. This method generates a return on wagers of roughly 10 to 20 percent.

The Good Dog method identifies true favorites among the underdogs and generates ~30 percent return on wagers.

(*) Keep in mind that betting should always be considered entertainment. Never bet money you can't afford to lose.

Figure 1. Return on Wagers by method.

The figures above are per season. Per UFC-annum (circa four 75-fight seasons), the rates of return are 30 percent, 40-60 percent and 200 percent for the Standard, Cherry Picking and Good Dog methods respectively, if winnings are rewagered each season. Predictions are organized in seasons to keep track of return and prediction accuracy.

The first season spanned June 23, 2018 to September 8, 2018, a total of 77 days.

The second season spanned September 15, 2018 to November 30, 2018, a total of 76 days.

The third season spanned December 1, 2018 to February 2, 2019, a total of 63 days.

An extra TL;DR for the nerds out there: all predictions have been made out-of-sample on fresh, unseen data. Initial model accuracy was validated with cross-validation.

Introduction

I've been predicting UFC fights since May 2018. Altogether I've predicted 165 fights since UFC 224 Nunes vs Pennington. To keep things organized, I keep records of the predictions in "seasons" containing circa 50 fights.

Is it possible to not lose at betting?

In order to win at betting, one must know who the true favorites and the true underdogs are. That's the challenge of betting. The betting companies set the odds, and we guess where the betting companies are right and wrong. It's pretty straightforward and (kind of) fair. The only thing that prevents perfect fairness is the fee the betting company takes on the favorite, the underdog or both. A true "even-money" fight should return 2 and 2 in decimal odds (+100 in moneyline terms), but the bookmaker's even-money is usually 1.9 and 1.9. Never 2 and 2. That's the bookmaker's fee.

When converting decimal odds to implied probability (1 divided by the decimal odds), perfect fairness would give us a total of 100 percent. Take a recent example. Ashlee Evans-Smith was an underdog at 2.5 and Andrea Lee was a favorite at 1.53. The implied probabilities are 40 percent and about 65 percent, so the combined likelihood of those odds is 105 percent.
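For the nerds, here is a minimal Python sketch of that conversion. It is only an illustration of the arithmetic, not the code behind the model, and the function names are made up for this example. The odds are the ones quoted above.

```python
def implied_probability(decimal_odds):
    """Convert decimal odds to the win probability they imply."""
    return 1.0 / decimal_odds


def overround(odds):
    """Sum the implied probabilities of every outcome in one fight.

    A perfectly fair book sums to 1.0 (100 percent); anything above
    that is the bookmaker's margin.
    """
    return sum(implied_probability(o) for o in odds)


# Odds quoted in the text: Evans-Smith at 2.5, Lee at 1.53.
print(f"Evans-Smith vs Lee: {overround([2.5, 1.53]):.1%}")
# Bookmaker's even-money versus true even-money.
print(f"1.9 / 1.9: {overround([1.9, 1.9]):.1%}")
print(f"2.0 / 2.0: {overround([2.0, 2.0]):.1%}")
```

Both the real fight and the bookmaker's "even-money" line land a little above 100 percent, and the excess is the fee.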

Bookmakers are often correct. Following the bookmakers (betting on every favorite) will give you a "prediction accuracy" of slightly more than 60 percent, but a negative return on your bets over time (I tested this in repeated cross-validation iterations). This means that if you always bet on the favorite, you'll win most of your bets, yet your return will be -5 percent.

Online wagering is a massively profitable business because the bookmakers receive action on every fight. Their risk is spread out, which means they're not dependent on a single bet or bet combination. It's a patient organization's game, geared towards winning over time regardless of bettors' performance.

My method

I've scraped Wikipedia for all UFC events since 2010 and for characteristics from all the fighters' Wikipedia pages. This produced a dataset of 3200 UFC fights. I predict fights using a GLM (logistic regression) model, which returns the likelihood that a fighter wins the bout. I spread the risk over 50 fights to reduce the amount of randomness affecting the return. I'm considering a move to 75, as this increases stability in trials.

Overall prediction accuracy is 63.5 percent. 

Standard

This method is the most basic: bet on all the fights. Let 5000 imaginary dollars be the total stakes for the season (50 fights). Spread the stakes evenly over all the fights ($5000 / 50 = $100 per fight). If there are 52 fights in a season (depending on predictable fights in the last event), then the total stakes will be $5200.
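The flat-stake bookkeeping can be sketched in a few lines of Python. This is an assumption about how such a tally could be kept, not the actual spreadsheet or script behind the seasons; the function name and the five example results are hypothetical.

```python
def return_on_wagers(bets, stake=100.0):
    """Tally a season of flat-stake bets.

    bets: list of (decimal_odds, won) tuples, one per fight.
    Returns (total_staked, net_profit, return_as_fraction).
    """
    if not bets:
        raise ValueError("need at least one bet")
    total_staked = stake * len(bets)
    # A winning bet pays stake * odds; a losing bet pays nothing.
    payout = sum(stake * odds for odds, won in bets if won)
    net = payout - total_staked
    return total_staked, net, net / total_staked


# Hypothetical results, for illustration only.
example = [(1.5, True), (2.4, False), (1.8, True), (3.0, True), (1.3, False)]
staked, net, ror = return_on_wagers(example)
print(f"staked {staked:.0f}, net {net:.0f}, return {ror:.0%}")
```

Because a win pays out in proportion to the odds, two seasons with the same hit rate can end up with quite different returns.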

Even in season 3, where the prediction accuracy came in at a low 56 percent, there was a positive net return. Notice that return is dependent on odds, so even though season 2 had higher prediction accuracy, season 1 scored a better return. 


Cherry picking

The next method is Cherry Picking the best fights. After we discussed the model on our podcast, a listener suggested "picking the fights that were most likely to be correct". I scoffed at the idea at first, but later I realized that this could be a valid approach, so I tested it. The prediction response in a logistic regression is a value between 0 and 1. Less certain predictions are closer to .5, which is the same as an even-money pick-em. By only wagering on the fights that are close to 0 or 1, we wager on the best predictions of the model.
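As a rough Python sketch, the filter could look like the following. The cutoff (`margin`) is a hypothetical value chosen for the example, since the exact threshold is not stated here, and the fight labels and probabilities are invented.

```python
def cherry_pick(predictions, margin=0.25):
    """Keep only the one-sided predictions.

    predictions: list of (fight_label, probability) tuples, where the
    probability is the model output between 0 and 1.
    margin: hypothetical cutoff; a prediction is kept when it is at
    least this far away from the 0.5 pick-em point.
    """
    return [(fight, p) for fight, p in predictions if abs(p - 0.5) >= margin]


# Invented model outputs, for illustration only.
season = [("Fight A", 0.82), ("Fight B", 0.55), ("Fight C", 0.20), ("Fight D", 0.48)]
for fight, p in cherry_pick(season):
    print(fight, p)
```

A wider margin leaves fewer fights to bet on, so the return for a season rests on a smaller sample.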

Overall prediction accuracy in cross-validation increases from 63 percent to 75 percent. The total for the three seasons in question, however, is only 62.5 percent. This means more cherries than assumed were rotten.

As you can see from the table below (and the figure above), Cherry Picking can be rewarding, but is it as stable as the Standard method? 


Good dogs

The Good Dog method is the most profitable. My model predicts differently than oddsmakers, so sometimes there are hidden favorites, or True Favorites, in keeping with the jargon from above.  

In betting strategy, we're not actually interested in the overall prediction accuracy, but in the return. Prediction accuracy is around 50 percent for our Good Dogs, but that doesn't matter when you're rewarded with a minimum of double your money for each of those fights you win.

As can be seen in the table, about 30 to 40 percent of fights have a Good Dog, a True Favorite. With a prediction accuracy of 50 percent, odds of exactly 2 would only break even, so the return comes from the Good Dogs priced above double money. There is high stability across seasons in both the number of Good Dogs and the prediction accuracy.
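The Good Dog rule and its break-even arithmetic can be sketched in Python. The selection rule below is an assumption based on the description above (the model favors the fighter while the bookmaker pays at least double money), not the exact rule used for the seasons, and both function names are made up.

```python
def is_good_dog(model_prob, decimal_odds):
    """Assumed rule: the model makes the fighter a favorite (> 0.5)
    while the bookmaker prices them as an underdog (odds of 2.0 or more)."""
    return model_prob > 0.5 and decimal_odds >= 2.0


def expected_return(win_rate, decimal_odds):
    """Expected net return per unit staked at a given hit rate and price."""
    return win_rate * decimal_odds - 1.0


print(is_good_dog(0.58, 2.5))   # model favorite priced as an underdog
print(is_good_dog(0.58, 1.6))   # bookmaker favorite, so not a Good Dog
print(f"{expected_return(0.5, 2.0):.0%}")  # break-even point
print(f"{expected_return(0.5, 2.6):.0%}")  # hypothetical average price
```

At 50 percent accuracy the break-even price is exactly 2.0, and everything above that is profit. The 2.6 is only there to show what average price would correspond to a return of about 30 percent; it is not a figure from the seasons above.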

