I had the pleasure of attending the Ottawa Hockey Analytics Conference this past weekend. It was an absolute blast and I recommend attending one if you get the chance.
Before I dive into the presentations, I want to take the chance to say thank you to the organizers, specifically Rob Vollman and Michael Schuckers. I had the opportunity to meet both while I was there, and they are incredibly nice and willing to chat with everyone. I’d also like to give a shout-out to the venue, Carleton University. I think we were very lucky to get a room this weekend, as it appeared Carleton students were right in the middle of midterms.
Okay, on to the conference.
The day started out with a great presentation from Vollman on the history of hockey analytics. He went as far back as 1940 and showed when certain stats, such as plus-minus and save percentage, came into existence. He then discussed more recent history, when he, Iain Fyffe, Tom Awad and others were the first to start talking and blogging about hockey on the internet. He finished by talking about the future with Sportvision and tracking technology.
The “first period” of the conference was on hockey data resources. David Johnson, Andrew Thomas and Josh Weissbock each presented on where they get the data for their respective sites. What was interesting was how many different places the data comes from. Johnson said his site pulls data from at least four different sources. He discussed the difficulties in doing this, as the NHL data is far from perfect. There are inconsistencies in player names (e.g. Matt, Mathew) and occasionally large oversights in tracking who is on the ice. These mostly involve goalies: he said there have been entire periods where no goalie was recorded as being on the ice.
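To give a feel for the name problem, here is a minimal sketch of how spelling variants could be folded into one key before merging data from several sources. This is my own illustration, not how Johnson's site actually does it; the alias table, the function name and the sample names are all made up.

```python
# Hypothetical alias table mapping first-name variants to one canonical form.
# The entries are illustrative assumptions, not a real lookup used by any site.
FIRST_NAME_ALIASES = {"matt": "matthew", "mathew": "matthew", "mike": "michael"}

def canonical_name(raw):
    """Return a lowercase, whitespace-normalized name with a canonical first name."""
    parts = raw.strip().lower().split()
    if not parts:
        return ""
    first = FIRST_NAME_ALIASES.get(parts[0], parts[0])
    return " ".join([first] + parts[1:])

# Two sources spelling the same (made-up) player differently map to the same key.
print(canonical_name("Matt Example"))
print(canonical_name("  Mathew   EXAMPLE "))
```

A lookup like this only catches the variants you already know about, which is probably part of why the cleanup is so much work.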
For all the difficulties Johnson and Thomas have to deal with, they are nothing compared to what Josh has to put up with. I’ve mentioned his site CHLstats.com on here numerous times, so I was really interested in the behind-the-scenes look at how he does it. He doesn’t have real-time stats, so his data is limited to game sheets. The big issue with this is that the sites he uses (particularly WHL.ca) are terribly designed, which makes his job much more difficult. Another issue that I found very interesting was that height data cannot be trusted. He said most players tend to be listed at either 5’10 or 6’0, and almost none at 5’11. Despite all these hurdles, Josh still has big plans. He hopes to add quality of competition and quality of teammates to his site in the near future.
The “second period” had some great work on shot quality from Sam Ventura and Matt Cane. Sam briefly discussed the limitations of Corsi, which counts all shots as equal, and the limitations of scoring chances, which ignore a large portion of shots. He combined the two and did some math to come up with expected goals. This is a very repeatable stat: the correlation between expected goals and future expected goals was 0.594. It also correlated nicely with actual GF%, with an R^2 of 0.411. This is actually slightly better than Corsi, which came in at 0.408.
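For anyone who wants to see the general idea in code, here is a rough sketch: give every shot a goal probability, add them up to get expected goals, and use a correlation to check repeatability. The danger zones and probabilities are numbers I made up for illustration; they are not from Sam's model, which I can't reproduce from my notes.

```python
from math import sqrt

# Assumed goal probabilities by shot danger zone (illustrative values only).
GOAL_PROB = {"high": 0.20, "medium": 0.08, "low": 0.03}

def expected_goals(shots):
    """Sum the assumed goal probability of every shot attempt."""
    return sum(GOAL_PROB[zone] for zone in shots)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Expected goals for one made-up game: two high, one medium, two low danger shots.
game_shots = ["high", "low", "low", "medium", "high"]
print(round(expected_goals(game_shots), 2))
```

Repeatability would then be something like `pearson(first_half_xg, second_half_xg)` across teams, where both lists are hypothetical per-team totals. For a simple one-variable linear fit, squaring that correlation gives the R^2.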
Matt Cane then presented on weighted shots, building upon the work Tom Tango did. The first thing Matt talked about was having to adjust for score effects and scorekeeper bias. He also stated the need to split up forwards and defencemen. This point was key, as weighted shots were repeatable year to year and predictive of future goals for forwards but not for defencemen. Weighted shots for forwards are also more repeatable and predictive than score-adjusted Corsi.
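The basic mechanics of a weighted shot metric are simple enough to sketch: each type of shot event gets its own weight instead of counting equally. The weights below are placeholders I picked for illustration. They are not Matt's or Tango's actual values, and the sketch leaves out the score and scorekeeper adjustments he described.

```python
# Placeholder weights per event type (assumed for illustration only).
WEIGHTS = {"goal": 1.0, "shot_on_goal": 0.2, "miss": 0.1, "block": 0.1}

def weighted_shots(events):
    """Weighted total of shot events; `events` maps event type to a count."""
    return sum(WEIGHTS[kind] * count for kind, count in events.items())

def weighted_shot_share(events_for, events_against):
    """Share of all weighted shots that went in the player's or team's favour."""
    w_for = weighted_shots(events_for)
    w_against = weighted_shots(events_against)
    return w_for / (w_for + w_against)

# Made-up on-ice totals for one player.
on_ice_for = {"goal": 2, "shot_on_goal": 10, "miss": 5, "block": 5}
on_ice_against = {"goal": 1, "shot_on_goal": 10, "miss": 5, "block": 5}
print(round(weighted_shot_share(on_ice_for, on_ice_against), 3))
```

Plain Corsi is the special case where every weight is 1, which is the "all shots are equal" limitation mentioned earlier.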
The “long change” portion of the conference focused on specific teams. Steve Burtch talked about the Leafs and Sens and coaching changes; friend of the blog Emmanuel Perry talked about the Sens, Microstats and Blue Line Data; and Andrew Berkshire talked about the Canadiens and PK Subban.
Burtch looked at usage and how performance shifted after a coaching change. He showed player usage charts (PUCs) of the Leafs under Carlyle and then under Horachek to give us an idea of how player usage has changed. He did the same thing with the Sens, looking at usage under MacLean and then Cameron. He talked about how performance has gotten better for each team and used the PUCs to explain why.
Emmanuel discussed the project he had been working on over the past year. He tracked every blue line play for the Sens for the past season: every time the puck crossed the blue line, whether on an entry or an exit, he tracked it. He said there are about 400-500 of these events a game. He showed how Cody Ceci had a poor Corsi and explained that he was the worst Senators D-man at preventing controlled entries. This was one of the best presentations of the whole day. I was so drawn in by it that I completely forgot to write notes on it. I hope the presentation slides become available online so we can all get a chance to see his great work.*
*Editor’s note: Emmanuel has posted his presentation to YouTube, which can be seen here
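To show what can be done with this kind of tracking, here is a small sketch that turns a list of tracked entries into a "controlled entries allowed" rate per defenceman. The event format, the entry labels and the player names are my assumptions; I don't know how Emmanuel actually structures his data.

```python
from collections import defaultdict

def controlled_entry_rate(events):
    """Share of entry attempts against each defender that were controlled.

    `events` is a list of (defender, entry_type) pairs, where entry_type is
    an assumed label such as "controlled", "dump_in" or "failed".
    """
    totals = defaultdict(int)
    controlled = defaultdict(int)
    for defender, entry_type in events:
        totals[defender] += 1
        if entry_type == "controlled":
            controlled[defender] += 1
    return {d: controlled[d] / totals[d] for d in totals}

# Made-up tracking data for two anonymous defencemen.
tracked = [
    ("D1", "controlled"), ("D1", "controlled"), ("D1", "dump_in"), ("D1", "failed"),
    ("D2", "dump_in"), ("D2", "failed"), ("D2", "controlled"), ("D2", "dump_in"),
]
for defender, rate in sorted(controlled_entry_rate(tracked).items()):
    print(defender, rate)
```

A lower rate means the defenceman is forcing more dump-ins or failed entries, which is the skill Emmanuel was measuring.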
Berkshire had a great presentation breaking down PK Subban. He talked about how the Canadiens coaching staff has tried to change Subban’s style of play to make him a better defenceman. What baffled me was that they decided to do this the year after he won the Norris Trophy. The Canadiens thought that Subban was a defensive liability. What Berkshire showed was that in getting Subban to play “safe hockey” (chips off the glass, dump-ins, etc.), they have actually made him a worse defensive player.
The “third period” had Tom Awad talking about the future of hockey analytics, Timo Seppa applying fancy stats to the NCAA and Michael Schuckers talking about Central Scouting and the draft. Awad’s presentation was similar to Vollman’s in talking about Sportvision. He discussed how this is going to drive the future of hockey analytics, not just through the new data we will be able to get but also by improving the data we already have. For example, shot location data is all recorded manually, so many errors have occurred, such as shots being recorded as taken from the crowd. Tracking technology will eliminate all those mistakes.
Timo has been tracking zone entries for Quinnipiac University. He talked about the importance of gaining the zone with control and how important it is for defencemen to break up those entries. He showed how a couple of defencemen who were perceived as not great defensively actually excelled at preventing entries (all players were anonymous).
Finally, Schuckers capped off the day with a great presentation on Central Scouting (CSS) and the NHL draft. He framed CSS as the “market” and asked whether NHL scouts could beat the market. It turns out scouts can beat the market, but only early in the draft. For the first 20 picks or so, CSS and scouts do a similarly good job at picking players (based on future GP and TOI). From around pick 20 to pick 100 the scouts were better, and from about pick 180 on CSS was better. Schuckers then estimated that the value scouts provide is around 2M dollars, and that this value comes in rounds 2 and 3.
Overall, this was an amazing experience. I got to meet some great people who are driving the hockey analytics movement, as well as some awesome people who, like myself, are along for the ride. The weekend was full of jokes and debates, and I have so many topics and ideas I want to explore further. Be on the lookout here for that.
Follow me on Twitter @PaulBerthelot