
The Brief Summary

This is an "advanced" ranking system in the sense that it uses strength of schedule, compounded iteratively until the rankings stabilize. It is meant to be purely predictive, which is to say it focuses on predicting future games rather than explaining past ones. It uses final scores, dates, and locations of games, and currently nothing more. It also uses data from previous seasons until there is enough data in the current season.

Why I started making sports rankings

My interest in sports rankings started with college football in 2005. I was already a Java programmer (as a hobby, not by profession) and had written a number of other programs, including a chess AI. Most of them I made for the challenge of it, just to see if I could do it. The sports ranking systems were no different: I wanted to see how good a system I could make.

My first system was not very good, but it was a decent starting point and provided the basis for the systems to come. I used it to enter (and sometimes win) small bowl pick'em pools. More recently, in the summer of 2014, I got the idea to do the same for basketball. I have also been picking brackets since 2005, and soon after that I wrote a program which would create brackets for me by flipping "smart coins" based on win probabilities. Until recently the win probabilities were either estimated or simply pulled from historic seed upset probabilities. The idea of using actual power ratings, from which probabilities could be calculated more accurately, was appealing. That idea has now been realized here.

My system's philosophy

There is a seemingly endless number of ways to evaluate a team, the season it has had, and the way it is expected to play in the future. On one hand you have over a hundred computer systems represented in Massey's Computer Composite, linked at the bottom of the page. It is interesting to click through all the pages and see how many different people have come to different (and reasonable-sounding) conclusions about how to rank the teams. There are linear regression models, win probability ratio models, Elo-type sequential systems, systems which try to break the game down to play-by-play efficiency and simulate games from that data, and countless more.

On the other hand, there are human polls, tendencies, and conventional wisdom. Lines like the following are thrown around constantly on all classes of sports networks:
- "Team X has a problem with finishing games."
- "It was ugly, but all that matters is they got the W today."
- "Team X will stay #1 until they are beaten."
- "Defense wins championships."
- "Team X's tough nonconference schedule will have them ready for conference play."
- "Team X is in for a shock the first time they play an opponent with a pulse."
- "I like Team X in this matchup, coming off a bye week."
- "This is a trap game for Team X, going into their big showdown next week."
- "Team X might struggle today after the big physical and emotional game last week."

There is probably some measure of truth in all of the above statements. The question is how much truth, and how it can be quantified. One truth of which we can all be certain is that no system can possibly be 100% accurate. Sports are too random, and their outcomes depend on too many variables which simply cannot be predicted. This leads to results which don't make sense and seemingly cannot be explained. The goal of my system is to take into account as many factors as I can think of that might have an impact on a game, and to combine and weight them in such a way that the system at least does the best possible job of predicting games in past seasons. If enough past seasons are included in the analysis, the randomness starts to fall out and real patterns begin to emerge. Mine is therefore an empirical approach: out of all these ranking methods and wisdoms, which ones hold true over history and which ones are myths?

How my system works

The ranking system currently incorporates three main components:
- a straight-up transitive scoring margin component, similar to linear regression models;
- a more achievement-based component, which values wins more highly (especially those against stronger opponents) and gives a diminishing return on blowout wins;
- a transitive "tempo-free" component, which cares only about the ratio of points scored between the two teams.
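To make the three components concrete, here is a minimal sketch of how each might score a single game for one team. The function names, the opponent-rating scaling, and the tanh squashing used for diminishing returns are all illustrative assumptions, not the site's actual formulas:

```python
import math

def margin_component(points_for, points_against):
    # Straight-up scoring margin, as in linear-regression models.
    return points_for - points_against

def achievement_component(points_for, points_against, opponent_rating):
    # Values the win itself, scaled by opponent strength, with a
    # diminishing return on blowout margins (tanh is an assumed
    # squashing function, not the author's actual formula).
    win_value = 1.0 if points_for > points_against else 0.0
    margin = points_for - points_against
    return win_value * opponent_rating + math.tanh(margin / 15.0)

def tempo_free_component(points_for, points_against):
    # "Tempo-free": only the points ratio matters, so a 30-15 win
    # is scored the same as a 60-30 win.
    return points_for / points_against
```

Note how the achievement component barely distinguishes a 50-point win from a 25-point win, while the margin component treats them as twice as different.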

All three components are run through an iterative process which adjusts them according to strength of schedule until the ratings stabilize. A linear combination of the three components is then taken as the final rating. The weights of the combination are chosen so that the point spread predictions it would have made in past seasons come as close as possible to the actual results. One might say that instead of being derived purely from mathematics, it is more of a brute-force historical fit.
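The iterative strength-of-schedule adjustment can be sketched as follows. This particular averaging scheme (rating = average margin plus average opponent rating, with damping and centering) is one simple, assumed take on "adjust until the ratings stabilize," not the author's exact method:

```python
def iterate_sos(games, n_teams, tol=1e-6, max_iter=1000, damping=0.5):
    # games: list of (team_a, team_b, margin), margin = a_score - b_score.
    ratings = [0.0] * n_teams
    for _ in range(max_iter):
        totals = [0.0] * n_teams
        counts = [0] * n_teams
        for a, b, margin in games:
            # Raw rating: average margin plus average opponent rating
            # (the strength-of-schedule adjustment).
            totals[a] += margin + ratings[b]
            totals[b] += -margin + ratings[a]
            counts[a] += 1
            counts[b] += 1
        new = [t / c if c else 0.0 for t, c in zip(totals, counts)]
        # Center on zero so the scale has a fixed reference point.
        mean = sum(new) / n_teams
        new = [r - mean for r in new]
        # Damping keeps the iteration from oscillating.
        new = [damping * old + (1 - damping) * r
               for old, r in zip(ratings, new)]
        if max(abs(x - y) for x, y in zip(new, ratings)) < tol:
            return new
        ratings = new
    return ratings
```

Running each component's game values through a loop like this yields three stabilized rating lists; the final rating would then be a weighted sum of the three, with the weights found by searching for the historically best-predicting combination.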

My system currently uses nothing in its calculations besides final scores, venues, and dates. The date is important because recent results are weighted more heavily than distant ones. Early in the season, the system also uses some data from previous seasons. The basketball system only uses Division I results, and the football system only uses FBS results (other games are still listed in the win/loss column, but have no effect on the rankings).
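A common way to weight recent games more heavily is exponential decay; the form and half-life below are assumptions for illustration, since the site does not specify its actual weighting scheme:

```python
def recency_weight(days_ago, half_life=60.0):
    # Exponential decay: a game half_life days old counts half as
    # much as one played today. (Assumed form and half-life.)
    return 0.5 ** (days_ago / half_life)
```

Each game's contribution to a component would then be multiplied by this weight, so a November result slowly fades as March approaches.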

I'd like to add that I think the idea of providing a single rating number for each team, ranking them by it, and using those numbers to make predictions is inherently flawed. While I do provide such numbers on my site and allow their use in making point spread estimates, I need only refer you to the above list of "human network" quotes to see why. Factors such as emotional wins, trap games, and plain rock-paper-scissors matchups between play styles make such a transitive, single list impossible. Beyond that, it is simply not true that if Team A beats B by 10 points and B beats C by 10 points, A should beat C by 20 points; the actual result is typically inflated into a 40-point blowout instead. There is also a limit on the upper end, as transitive arguments like this might suggest the #1 team would beat the worst team by 150 points, which obviously does not happen, because of subbing in the second string and so on. The solution is to post a set of ratings which does its best job, but to have a separate predictor system which uses different logic. This is currently a work in progress, and I hope to have a prototype ready by Fall of 2016.

Thoughts on the future

At the moment the system is still very much a work in progress; I only moved it to the web in the summer of 2014. I currently have one main idea for future development of the system.

The idea is what I call a "layering" approach. This would use some combination of the above components to create a "base" rating as a starting point, and then apply a different rating approach on top of it, using the base values as the strength-of-schedule input. For example, I could construct a reasonable base rating, then as a final layer give non-zero weight only to the last N games played. Another promising layer would weight games against similar opponents more highly, to mitigate the damage done by 50- and 60-point blowouts of bad teams. Overall I have dozens of ideas for layers that could be added, as well as the idea of applying several layers in sequence to the end result.
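The "last N games" layer described above might look something like this sketch, where a precomputed base rating serves as the strength-of-schedule input for a re-rating pass. The data layout and function name are illustrative assumptions:

```python
def last_n_layer(team_games, base_ratings, n=10):
    # team_games: {team: [(opponent, margin), ...]} in chronological order.
    # base_ratings: {team: rating} from an earlier "base" pass.
    # Re-rate each team using only its last n games, with the fixed
    # base ratings standing in for opponent strength.
    layered = {}
    for team, games in team_games.items():
        recent = games[-n:]
        layered[team] = sum(margin + base_ratings[opp]
                            for opp, margin in recent) / len(recent)
    return layered
```

Because the base ratings are held fixed, this layer needs no iteration of its own; other layers (such as similar-opponent weighting) would follow the same pattern with a different weight on each game.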


Thanks should be given to the following:

Ken Pomeroy - For being an early inspiration to me for computer ranking systems, and for fighting the good fight for public acceptance of advanced metrics. Down with RPI!

Ken Pomeroy - For an updated game database for basketball.

Kenneth Massey - For operating a great site and providing a nice introduction to computer ranking systems in general. His Master's thesis, available at that location, was very thought provoking and a good read for anyone interested in this stuff. He also provides easily readable football scores which I use for my rankings.

College Basketball Ranking Composite - Operated by Kenneth Massey, this is a nice place to view and compare different basketball rankings, both computer and human poll based (including my own). There is also a Football Composite.

Bracket Matrix - For hosting my automated basketball bracket predictions (and also just for being awesome and existing).

My Dad, Tom Wilson, for encouraging me in the three passions which collide in making these rankings possible - sports, math, and programming. Thanks to him also for hosting this web domain. BuzzPlugg itself serves as a forum for conversation about beers - see what beers other people like and chime in on one you've had recently that hit the spot! "Plugg" the ones you like, "Nixx" the ones you don't, and the system tracks your favorites. It is all totally free!

James Howell, for providing historical football scores (which are badly needed for the kind of analysis I do!)

I will probably think of others to put here later!


Although I believe the predictive power of my ratings to be quite good, I also acknowledge that sports are inherently random and to some extent unpredictable. Just viewing the upset tables can convince you of that: a team that is a 20-point underdog still has a chance if the cards fall right (and that chance may be higher than most people think!) Even a theoretically perfect rating system will be wrong some percentage of the time. Keep this in mind if you plan to use these ratings for any personal purposes.


If you have any questions or comments about my rankings, methodology, bracketology, or anything else, you can contact me at: