First up, killing the Supreme Court.  Again.  But still with numbers and statistics, because that's the best way to do things.  Assume the Senate decides to stop being dumb.  Then Merrick Garland gets a hearing and, since he's basically fine, a seat on the Supreme Court, now that my least favorite justice is dead.
Wednesday, 30 March 2016
Sunday, 27 March 2016
Final final four
Today's also the last day I update the sports stuff for this year.  Here's the table for the rest of the tournament:
| #Bracket | N_R1 | PP_R1 | Nwrong_R1 | P_R1 | S_R1 | N_R2 | PP_R2 | Nwrong_R2 | P_R2 | S_R2 | 
| Mine | 32 | 1 | 6 | 26 | .995 | 16 | 2 | 4 | 50 | .998 | 
| Heart-of-the-cards | 32 | 1 | 10 | 22 | .656 | 16 | 2 | 8 | 38 | .320 | 
| Julie | 32 | 1 | 10 | 22 | .656 | 16 | 2 | 6 | 42 | .738 | 
| BHO | 32 | 1 | 9 | 23 | .823 | 16 | 2 | 6 | 43 | .820 | 
| 538 | 32 | 1 | 8 | 24 | .928 | 16 | 2 | 7 | 42 | .738 | 
| Rank | 32 | 1 | 13 | 19 | .129 | 16 | 2 | 6 | 39 | .424 | 
| #Bracket | N_R3 | PP_R3 | Nwrong_R3 | P_R3 | S_R3 | N_R4 | PP_R4 | Nwrong_R4 | P_R4 | 
| Mine | 8 | 4 | 3 | 70 | .998903 | 4 | 8 | 3 | 78 | 
| Heart-of-the-cards | 8 | 4 | 6 | 46 | .044 | 4 | 8 | 4 | 46 | 
| Julie | 8 | 4 | 4 | 58 | .610 | 4 | 8 | 4 | 58 | 
| BHO | 8 | 4 | 4 | 59 | .674 | 4 | 8 | 3 | 67 | 
| 538 | 8 | 4 | 2 | 66 | .955 | 4 | 8 | 2 | 82 | 
| Rank | 8 | 4 | 2 | 63 | .875 | 4 | 8 | 3 | 71 | 
| #Bracket | N_R5 | PP_R5 | Nwrong_R5 | P_R5 | N_R6 | PP_R6 | Nwrong_R6 | P_R6 | 
| Mine | 2 | 16 | 2 | 78 | 1 | 32 | 1 | 78 | 
| Heart-of-the-cards | 2 | 16 | 2 | 46 | 1 | 32 | 1 | 46 | 
| Julie | 2 | 16 | 2 | 58 | 1 | 32 | 1 | 58 | 
| BHO | 2 | 16 | 1+ | 67+ | 1 | 32 | 1 | 67+ | 
| 538 | 2 | 16 | 2 | 82 | 1 | 32 | 1 | 82 | 
| Rank | 2 | 16 | 2 | 71 | 1 | 32 | 1 | 71 | 
If the President gets his pick correct in the next round, then he'll win with an 83.  Otherwise, 538 wins based on only getting two wrong in round 4.  Everything else is locked in now, so there's nothing really to update anymore.
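The running totals in these tables are just points-per-pick times the number of correct picks, summed over rounds. Here's a small sketch to check that arithmetic, including the President's possible 83. The function and variable names are mine for illustration, and the wrong-pick counts are copied from the tables above.

```python
# Games per round and points per correct pick, as in the tables above.
GAMES = [32, 16, 8, 4, 2, 1]
POINTS_PER_PICK = [1, 2, 4, 8, 16, 32]


def cumulative_points(n_wrong):
    """Return the running total after each round for a list of wrong-pick counts."""
    totals = []
    running = 0
    # zip stops at the shortest list, so partial tournaments work too.
    for games, pts, wrong in zip(GAMES, POINTS_PER_PICK, n_wrong):
        running += pts * (games - wrong)
        totals.append(running)
    return totals


# Wrong-pick counts through round 4, copied from the tables.
wrong_through_r4 = {
    "Mine": [6, 4, 3, 3],
    "BHO": [9, 6, 4, 3],
    "538": [8, 7, 2, 2],
}

for name, wrong in wrong_through_r4.items():
    print(name, cumulative_points(wrong))

# BHO if one of his two round-5 picks hits and his title pick misses.
bho_best_case = cumulative_points([9, 6, 4, 3, 1, 1])[-1]
# 538 with both round-5 picks and the title pick wrong.
five38_final = cumulative_points([8, 7, 2, 2, 2, 1])[-1]
print(bho_best_case, five38_final)
```

The comparison of `bho_best_case` against `five38_final` is the whole remaining contest.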
Friday, 25 March 2016
Round 3
Since it's the weekend, it's sports time.  First up, my picks for this round of things:
Figure: One that I was doomed to get wrong.
Figure: And the other doomed one. But a new mistake!
Texas A&M:
29.687500       14.062500       3.125000                3       6       3       Texas A&M
28.125000       18.750000       21.875000               3       8       2       Oklahoma
First up, I think my analysis notes have been wrong on the previous posts.  The file I'm pulling these numbers from is in 2016/2015/2014/group/game/rank/name format, not 2014/2015/2016 format.  This changes the analysis for some of my previous mistakes, but I'm too lazy to go correct those.  In any case, using this new, correct information, it looks like I thought (from the 2016 ratings) that Texas A&M should be slightly better than Oklahoma.  Folding in previous years could have potentially altered that choice.
I was thinking a bit about adding some score-based information in as well.  The idea is that each team scores a given median number of points across all their games, and has a given median number of points scored against them.  By comparing how well a given score ranks among all of a team's games, and among their opponent's, it should be possible to construct offense and defense ratings.  This might be useful to say, "Team X is generally better, but they only are a +1 in offense, and they're playing a +4 defense, so they might not win."  The other benefit would be to add two new metrics, which could then be fit across the full multi-year dual-gender score set to determine what relative weight each should be assigned in a more complete prediction model.
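One possible reading of that idea, as a rough sketch: call a team's offense rating the median of (points scored minus what that opponent typically allows), and its defense rating the median of (what that opponent typically scores minus points allowed). That definition is my assumption, not something I've settled on, and the game list below is made up purely for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical results: (team_a, score_a, team_b, score_b). Not real data.
GAMES = [
    ("X", 80, "Y", 70),
    ("X", 75, "Z", 72),
    ("Y", 65, "Z", 60),
    ("Y", 68, "X", 74),
    ("Z", 71, "X", 77),
    ("Z", 66, "Y", 69),
]


def team_medians(games):
    """Median points scored and median points allowed for every team."""
    scored, allowed = defaultdict(list), defaultdict(list)
    for a, sa, b, sb in games:
        scored[a].append(sa)
        allowed[a].append(sb)
        scored[b].append(sb)
        allowed[b].append(sa)
    med_for = {t: median(v) for t, v in scored.items()}
    med_against = {t: median(v) for t, v in allowed.items()}
    return med_for, med_against


def ratings(games):
    """Offense and defense ratings in points relative to each opponent's norm."""
    med_for, med_against = team_medians(games)
    off, dfn = defaultdict(list), defaultdict(list)
    for a, sa, b, sb in games:
        # Offense: how much more a team scored than the opponent usually allows.
        off[a].append(sa - med_against[b])
        off[b].append(sb - med_against[a])
        # Defense: how much less the opponent scored than it usually does.
        dfn[a].append(med_for[b] - sb)
        dfn[b].append(med_for[a] - sa)
    offense = {t: median(v) for t, v in off.items()}
    defense = {t: median(v) for t, v in dfn.items()}
    return offense, defense


offense, defense = ratings(GAMES)
for team in sorted(offense):
    print(team, offense[team], defense[team])
```

Positive numbers mean "better than what this opponent usually sees", which is the "+1 offense against a +4 defense" kind of comparison. Fitting relative weights against the existing ranking would be a separate step.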
I think the first step that I should do, though, is to dump all of that data into a database, instead of using horrible fixed-width formatted files to manage things.  That's largely a consequence of not really caring a lot about the project.
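A minimal sketch of what that could look like with Python's built-in `sqlite3`, assuming the 2016/2015/2014/group/game/rank/name layout described above. The table name, column names, and helper functions are all hypothetical, and it splits on whitespace rather than on true fixed-width columns.

```python
import sqlite3

# Two rows in the 2016/2015/2014/group/game/rank/name layout quoted above.
RAW = """\
29.687500       14.062500       3.125000                3       6       3       Texas A&M
28.125000       18.750000       21.875000               3       8       2       Oklahoma
"""


def parse_line(line):
    """Split one whitespace-delimited row; the team name may contain spaces."""
    s16, s15, s14, group, game, rank, name = line.split(None, 6)
    return (name.strip(), int(group), int(game), int(rank),
            float(s16), float(s15), float(s14))


def load(conn, text):
    """Create the (hypothetical) teams table and insert every non-blank row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS teams ("
        " name TEXT PRIMARY KEY, grp INTEGER, game INTEGER, sport_rank INTEGER,"
        " score_2016 REAL, score_2015 REAL, score_2014 REAL)"
    )
    rows = [parse_line(line) for line in text.splitlines() if line.strip()]
    conn.executemany("INSERT INTO teams VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # swap in a file path to keep the data
load(conn, RAW)
best = conn.execute(
    "SELECT name FROM teams ORDER BY score_2016 DESC LIMIT 1"
).fetchone()[0]
print(best)
```

Naming the columns by year would also have kept me from mixing up the column order in the first place.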
In any case, here's the comparison table for round three:
| #Bracket | N_R1 | PP_R1 | Nwrong_R1 | P_R1 | S_R1 | N_R2 | PP_R2 | Nwrong_R2 | P_R2 | S_R2 | 
| Mine | 32 | 1 | 6 | 26 | .995 | 16 | 2 | 4 | 50 | .998 | 
| Heart-of-the-cards | 32 | 1 | 10 | 22 | .656 | 16 | 2 | 8 | 38 | .320 | 
| Julie | 32 | 1 | 10 | 22 | .656 | 16 | 2 | 6 | 42 | .738 | 
| BHO | 32 | 1 | 9 | 23 | .823 | 16 | 2 | 6 | 43 | .820 | 
| 538 | 32 | 1 | 8 | 24 | .928 | 16 | 2 | 7 | 42 | .738 | 
| Rank | 32 | 1 | 13 | 19 | .129 | 16 | 2 | 6 | 39 | .424 | 
| #Bracket | N_R3 | PP_R3 | Nwrong_R3 | P_R3 | S_R3 | N_R4 | PP_R4 | Nwrong_R4 | P_R4 | S_R4 | 
| Mine | 8 | 4 | 3 | 70 | .998903 | 4 | 8 | | | | 
| Heart-of-the-cards | 8 | 4 | 6 | 46 | .044 | 4 | 8 | | | | 
| Julie | 8 | 4 | 4 | 58 | .610 | 4 | 8 | | | | 
| BHO | 8 | 4 | 4 | 59 | .674 | 4 | 8 | | | | 
| 538 | 8 | 4 | 2 | 66 | .955 | 4 | 8 | | | | 
| Rank | 8 | 4 | 2 | 63 | .875 | 4 | 8 | | | | 
This now has the added S_RX columns. These are my simulated CDF values based on the Yahoo selection pick fractions given for each team. This is another piece of kind-of garbage code that I threw together earlier in the week. I think it's doing everything correctly, but I don't see any simulated results that get a total score above 83, and Yahoo does list some in their leader list. Maybe 1e6 simulations isn't sufficient to fully probe things? Maybe I'm truncating or rounding something odd? The main idea behind this calculation is to see how well a given set of picks should rank.
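For reference, here's a stripped-down sketch of that kind of calculation, not the actual code. It assumes every game is picked independently with probability equal to the fraction of entrants who chose the eventual winner, which ignores the fact that a real bracket has to be self-consistent across rounds. The pick fractions below are made up, not Yahoo's numbers.

```python
import random
from bisect import bisect_right


def simulate_scores(games, n_sims, seed=0):
    """Simulate bracket scores and return them sorted.

    games: list of (points, p_correct) pairs, where p_correct is the fraction
    of entrants who picked the eventual winner of that game.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sims):
        # Each game is an independent coin flip (a simplifying assumption).
        total = sum(pts for pts, p in games if rng.random() < p)
        scores.append(total)
    scores.sort()
    return scores


def empirical_cdf(sorted_scores, score):
    """Fraction of simulated brackets scoring at or below `score`."""
    return bisect_right(sorted_scores, score) / len(sorted_scores)


# Hypothetical pick fractions: 32 one-point games and 16 two-point games.
games = [(1, 0.7)] * 32 + [(2, 0.5)] * 16
sims = simulate_scores(games, n_sims=20000)
print(empirical_cdf(sims, 50))
```

In this picture an S_RX value is `empirical_cdf(sims, P_RX)` for the cumulative points through round X.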
Sunday, 20 March 2016
Round 2
Today was the end of round two of the sports thing.  I also need to go back and update posts with the new label I've decided is probably useful, "sports".  So I updated everything before the final game was over, and then had to double check nothing went wrong:
Again, three of my four mistakes this time around were caused by my winning choice being eliminated in the previous round. For the last one:
Xavier:
12.500000       42.187500       29.687500               2       7       7       Wisconsin
34.375000       12.500000       14.062500               2       8       2       Xavier
Why did I choose Xavier?  Did I get confused and use the 2014 rankings instead of the 2016 ones?  This looks like me being dumb.  Maybe I took the #2 ranking too seriously?  I should probably write down logic notes next time, so I can point to the error directly.
What does the scoring comparison look like?
| #Bracket | N_R1 | PP_R1 | Nwrong_R1 | P_R1 | N_R2 | PP_R2 | Nwrong_R2 | P_R2 | 
| Mine | 32 | 1 | 6 | 26 | 16 | 2 | 4 | 50 | 
| Heart-of-the-cards | 32 | 1 | 10 | 22 | 16 | 2 | 8 | 38 | 
| Julie | 32 | 1 | 10 | 22 | 16 | 2 | 6 | 42 | 
| BHO | 32 | 1 | 9 | 23 | 16 | 2 | 6 | 43 | 
| 538 | 32 | 1 | 8 | 24 | 16 | 2 | 7 | 42 | 
| Rank | 32 | 1 | 13 | 19 | 16 | 2 | 6 | 39 | 
Again the "rank" method is garbage, and shouldn't be used.  Nate Silver had a tweet earlier about how this is apparently because it relies too heavily on RPI.  Looking at Wikipedia, it looks like RPI is an incomplete version of my LAM method.  ¯\_(ツ)_/¯  This also shows the point where HotC totally falls apart, becoming the worst method.  Everyone else is pretty well clumped together.  I'm a bit surprised that 538 isn't doing better, given their "we included scores, and at-home values, and distances to the games, and the number of cats each player owns, and the SAT scores of each player" approach.
This also makes me think I should have actually entered my selections into some pool.  Maybe I should hone the method a bit more, and see how it works over a few more years.  Or, alternatively, I could do the reasonably easy thing and apply the method to the historical data, and see if this consistently matches reality.  Maybe next weekend, since I think it's a long one.  This will also make me fix my master Makefile to put things into logical directories, and not just dump the outputs into a common directory.
Friday, 18 March 2016
Round one
Statistics results.
Figure: Ok, that West Virginia loss is going to hit the later rounds.
Figure: As is Purdue. Not as bad as Michigan State, obviously.
Let's look at the comparison table:
| #Bracket | N_R1 | PP_R1 | Nwrong_R1 | P_R1 | 
| Mine | 32 | 1 | 6 | 26 | 
| Heart-of-the-cards | 32 | 1 | 10 | 22 | 
| Julie | 32 | 1 | 10 | 22 | 
| BHO | 32 | 1 | 9 | 23 | 
| 538 | 32 | 1 | 8 | 24 | 
| Rank | 32 | 1 | 13 | 19 | 
The columns are the bracket identifier, the number of games in the round, the points per correct selection in the round, the number wrong, and the total points.  The brackets are mine above, the "Heart of the Cards" bracket taken by simply selecting teams based on the 2016 ranking I calculated, Julie's bracket, President Obama's, the 538 bracket taken by assuming constant composite rankings from their pre-tournament predictions, and a dummy bracket constructed by selecting teams based solely on their "sport rank" thing.  That's actually working out a lot better than I expected.  I was correct in shaking up the straight HotC numbers with a bit of historical data.  Looking at the mistakes:
Arizona:
26.562500       43.750000       40.625000               1       5       6       Arizona
25.000000       37.500000       53.125000               5       1       11      Wichita St
I didn't believe the numbers, given the #11 ranking.  From above, I should ignore the ranking in the future, because it's pretty crappy.  The problem is that my numbers suggest that Wichita State is the best team in the entire thing, which doesn't seem like it's right.
West Virginia:
28.125000       21.875000       1.562500                2       6       3       West Virginia
34.375000       39.062500       45.312500               2       6       14      SF Austin
Ditto.  My numbers predict that SF Austin is the second best team.  I guess if either of them come out winning, I can say that I predicted it, and then tossed it in the trash.
Baylor:
17.187500       23.437500       20.312500               3       3       5       Baylor
25.000000       18.750000       7.812500                3       3       12      Yale
No clue, but it sounds like everyone was surprised by this one.
Purdue:
29.687500       14.062500       -3.125000               4       3       5       Purdue
37.500000       -7.812500       -3.125000               4       3       12      Ark Little Rock
My numbers say they both suck, so I went with last year's numbers to break the tie.  I could have added in the 2014 values, but this was a #12 ranking, and I didn't believe those.
Dayton:
28.125000       26.562500       20.312500               4       7       7       Dayton
9.375000        7.812500        34.375000               4       7       10      Syracuse
This one I should have gotten right.  I folded the two previous years in, and that said that I should trust consistency over a sudden jump.  Maybe Syracuse has some new great player.
Michigan State:
35.937500       18.750000       28.125000               4       8       2       Michigan St
23.437500       3.125000        23.437500               4       8       15      MTSU
Again, this one seemed like it was a surprise to everyone.  There are only three values of my ranking between these two values, so that kind of suggests they're within ~5% of each other in terms of skill.  Oh well.
Tuesday, 15 March 2016
I didn't really do any of the improvements I discussed two years ago.
Basketball
Basically my solution this time was:
- Check with Julie that no team went undefeated this year. That was my big problem last time.
- Run the 2016, 2015, and 2014 game solutions to determine the relative rankings for all of the teams in each of those years.
- Rank things based on the 2016 solution, letting 2015 solutions break ties. Also use this information (and the 2014) for solutions to:
- Anytime a #12 sport-ranked team is ranked substantially above a 1-4 sport-ranked team, assume something is off with the model, because I have a lot of #12 ranked teams ranked really high for some reason.
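The last two rules can be sketched as a small pick function. This is an illustration under assumptions, not the procedure I actually ran: the `Team` type, the `pick` name, and especially the `margin` threshold standing in for "substantially above" are all made up here. The scores are taken from the result table in this post.

```python
from typing import NamedTuple


class Team(NamedTuple):
    name: str
    seed: int      # the "sport rank"
    s2016: float   # 2016 solution score
    s2015: float   # 2015 solution score, used only to break ties


def pick(a: Team, b: Team, distrusted_seeds=(12,), margin=5.0) -> Team:
    """Pick on 2016 score, break ties on 2015, and distrust high-rated 12 seeds.

    `margin` is a hypothetical stand-in for "substantially above".
    """
    # Tuple comparison ranks on 2016 first and falls back to 2015 on a tie.
    hi, lo = (a, b) if (a.s2016, a.s2015) >= (b.s2016, b.s2015) else (b, a)
    if (hi.seed in distrusted_seeds and lo.seed <= 4
            and hi.s2016 - lo.s2016 >= margin):
        return lo  # assume the model is off and side with the seed
    return hi


kansas = Team("Kansas", 1, 40.625, 28.125)
austin_peay = Team("Austin Peay", 16, 1.5625, -21.875)
california = Team("California", 4, 20.3125, 4.6875)
hawaii = Team("Hawaii", 13, 34.375, 14.0625)
oregon = Team("Oregon", 1, 34.375, 25.0)
xavier = Team("Xavier", 2, 34.375, 12.5)

print(pick(kansas, austin_peay).name)
# Extending the distrust rule to 13 seeds flips this one to the 4 seed.
print(pick(california, hawaii, distrusted_seeds=(12, 13)).name)
# Equal 2016 scores, so the 2015 score decides (a hypothetical matchup).
print(pick(oregon, xavier).name)
```

Keeping the distrusted seeds as a parameter makes it easy to try the rule with and without extra seeds and see which games flip.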
 
So the result table is:
#2014.score     2015.score      2016.score    group   game  sport-rank  2016.score      Team.name
23.437500       28.125000       40.625000       1       1       1       40.625000       Kansas
-9.375000       -21.875000      1.562500        1       1       16      1.562500        Austin Peay
18.750000       -3.125000       17.187500       1       2       8       17.187500       Colorado
29.687500       7.812500        20.312500       1       2       9       20.312500       Connecticut
3.125000        32.812500       26.562500       1       3       5       26.562500       Maryland
9.375000        20.312500       29.687500       1       3       12      29.687500       S Dakota St
10.937500       4.687500        20.312500       1       4       4       20.312500       California
14.062500       14.062500       34.375000       1       4       13      34.375000       Hawaii
40.625000       43.750000       26.562500       1       5       6       26.562500       Arizona
NAN     NAN     NAN     1       5       11      NAN     VAN/WICH
1.562500        18.750000       28.125000       1       6       3       28.125000       Miami FL
14.062500       21.875000       9.375000        1       6       14      9.375000        Buffalo
12.500000       15.625000       17.187500       1       7       7       17.187500       Iowa
-20.312500      23.437500       15.625000       1       7       10      15.625000       Temple
37.500000       46.875000       37.500000       1       8       2       37.500000       Villanova
3.125000        -1.562500       17.187500       1       8       15      17.187500       UNC Asheville
21.875000       20.312500       34.375000       2       1       1       34.375000       North Carolina
NAN     NAN     NAN     2       1       16      NAN     FGCU/FDU
-15.625000      -12.500000      14.062500       2       2       8       14.062500       USC
18.750000       17.187500       20.312500       2       2       9       20.312500       Providence
3.125000        10.937500       28.125000       2       3       5       28.125000       Indiana
4.687500        18.750000       37.500000       2       3       12      37.500000       Chattanooga
20.312500       53.125000       26.562500       2       4       4       26.562500       Kentucky
18.750000       17.187500       31.250000       2       4       13      31.250000       Stony Brook
-3.125000       37.500000       15.625000       2       5       6       15.625000       Notre Dame
NAN     NAN     NAN     2       5       11      NAN     MICH/TULSA
1.562500        21.875000       28.125000       2       6       3       28.125000       West Virginia
45.312500       39.062500       34.375000       2       6       14      34.375000       SF Austin
29.687500       42.187500       12.500000       2       7       7       12.500000       Wisconsin
25.000000       6.250000        15.625000       2       7       10      15.625000       Pittsburgh
14.062500       12.500000       34.375000       2       8       2       34.375000       Xavier
12.500000       -6.250000       26.562500       2       8       15      26.562500       Weber St
21.875000       25.000000       34.375000       3       1       1       34.375000       Oregon
NAN     NAN     NAN     3       1       16      NAN     HC/SOUTH
23.437500       -7.812500       29.687500       3       2       8       29.687500       St Joseph's PA
32.812500       18.750000       18.750000       3       2       9       18.750000       Cincinnati
20.312500       23.437500       17.187500       3       3       5       17.187500       Baylor
7.812500        18.750000       25.000000       3       3       12      25.000000       Yale
28.125000       40.625000       20.312500       3       4       4       20.312500       Duke
-21.875000      6.250000        28.125000       3       4       13      28.125000       UNC Wilmington
20.312500       10.937500       12.500000       3       5       6       12.500000       Texas
1.562500        42.187500       15.625000       3       5       11      15.625000       Northern Iowa
3.125000        14.062500       29.687500       3       6       3       29.687500       Texas A&M
26.562500       23.437500       17.187500       3       6       14      17.187500       WI Green Bay
0.000000        4.687500        10.937500       3       7       7       10.937500       Oregon St
28.125000       26.562500       23.437500       3       7       10      23.437500       VA Commonwealth
21.875000       18.750000       28.125000       3       8       2       28.125000       Oklahoma
-9.375000       -7.812500       25.000000       3       8       15      25.000000       CS Bakersfield
34.375000       40.625000       29.687500       4       1       1       29.687500       Virginia
7.812500        -1.562500       17.187500       4       1       16      17.187500       Hampton
-6.250000       -9.375000       10.937500       4       2       8       10.937500       Texas Tech
-4.687500       18.750000       17.187500       4       2       9       17.187500       Butler
-3.125000       14.062500       29.687500       4       3       5       29.687500       Purdue
-3.125000       -7.812500       37.500000       4       3       12      37.500000       Ark Little Rock
29.687500       26.562500       15.625000       4       4       4       15.625000       Iowa St
17.187500       26.562500       18.750000       4       4       13      18.750000       Iona
0.000000        1.562500        26.562500       4       5       6       26.562500       Seton Hall
34.375000       46.875000       29.687500       4       5       11      29.687500       Gonzaga
14.062500       25.000000       28.125000       4       6       3       28.125000       Utah
4.687500        -3.125000       25.000000       4       6       14      25.000000       Fresno St
20.312500       26.562500       28.125000       4       7       7       28.125000       Dayton
34.375000       7.812500        9.375000        4       7       10      9.375000        Syracuse
28.125000       18.750000       35.937500       4       8       2       35.937500       Michigan St
23.437500       3.125000        23.437500       4       8       15      23.437500       MTSU
-1.562500       10.937500       9.375000        5       1       11      9.375000        Vanderbilt
53.125000       37.500000       25.000000       5       1       11      25.000000       Wichita St
14.062500       17.187500       10.937500       5       2       16      10.937500       FL Gulf Coast
-17.187500      -20.312500      6.250000        5       2       16      6.250000        F Dickinson
26.562500       0.000000        15.625000       5       3       11      15.625000       Michigan
14.062500       18.750000       14.062500       5       3       11      14.062500       Tulsa
9.375000        -3.125000       -7.812500       5       4       16      -7.812500       Holy Cross
9.375000        1.562500        15.625000       5       4       16      15.625000       Southern Univ
So, using this, I can answer the following questions I saw while doing the research of "figuring out what FLGU means".
- I saw a thing asking if Holy Cross was underrated. My analysis says "no," and concludes with a "holy crap, no."
- Kansas is probably going to win it all.
- Julie was right, Michigan State should have been ranked higher than Virginia.
- I've already correctly called the two group 5 games that have been played.
 
Using the ESPN clicky thing to apply these rules (I bent rule #4 to also apply to 13-ranked teams):
Figure: Group 1 and 2.
Figure: Group 3 and 4.
Figure: Final stuff. I don't know how to call the score thing. They're separated by ~4 points in the scores, or about 10%. So maybe 10 points, since basketball is a "log10(score) ~ 2" kind of game?
For the remaining pre-game things, I have Michigan and Southern University winning those (in addition to the correctly called Wichita State and Florida Gulf Coast).