Saturday, September 20, 2008

More Poker Odds

I decided to finish up the poker post I had earlier. After I had calculated a few probabilities, I found a Wikipedia page which covers the topic in a much more thorough manner. Last post I calculated the odds of having these starting hands: a pair (5.9%), suited connectors (3.9%), two cards both a 10 or higher (14.3%), or of having any of these hands (20.6%).

How do these hands relate to the starting hands the other people at the table have? Using the binomial distribution and the hypergeometric distribution, I came up with this table:

The first column is the number of people at the table who have that hand, assuming that there are 10 people at the table. The Pair column says that the probability of exactly two people having a pair is 9.6%. The Ace column says that about 86.7% of the time at least one ace is dealt to the table and about 50% (34.8+13.5+1.8) of the time two or more aces are dealt. So holding ace-2 as a starting hand means it is about even odds that yours is not the best ace hand at the table.
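As a check on those ace figures, here is a quick hypergeometric calculation in Python (a re-derivation, not the spreadsheet behind the original table):

```python
# 20 cards are dealt to a 10-person table from a 52-card deck with 4 aces.
from math import comb

def p_aces(k, cards_dealt=20):
    """Probability that exactly k of the 4 aces are among the dealt cards."""
    return comb(4, k) * comb(48, cards_dealt - k) / comb(52, cards_dealt)

at_least_one = 1 - p_aces(0)                        # ~0.867
two_or_more = sum(p_aces(k) for k in range(2, 5))   # ~0.501
print(round(at_least_one, 3), round(two_or_more, 3))
```

The two results match the 86.7% and roughly 50% figures quoted above.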

The most useful part of the Wikipedia page is the section detailing the approximation of hitting outs. Say that you have two spades and the flop gives you two more spades. That leaves 9 more spades in the deck, or 9 outs to get a flush. Simply multiply the number of outs by 4 to get the odds of getting a flush by the river: 4*9 = 36, so a 36% chance. This is an approximation, with the actual probability being 34.97%. For 10 or more outs after the flop the formula is 3x+9, with x being the number of outs. The approximation on the turn is 2x (or the better approximation of 2x+(2x/10)). Pretty useful, and it gives you a handy way of calculating pot odds during a hand.
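The rule of 4 can be checked against the exact figure with a few lines of Python (9 flush outs, 47 unseen cards, 2 cards to come):

```python
from math import comb

outs = 9
# Exact: 1 minus the chance that neither of the last two cards is an out.
exact = 1 - comb(47 - outs, 2) / comb(47, 2)
approx = 4 * outs / 100          # rule-of-4 estimate
print(round(exact, 4), approx)
```

The exact value comes out just under 35%, against the rule's 36%.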

Before the flop, what are the odds of improving the starting hands that I mentioned above? For a pair there is roughly an 11.5% chance of getting three of a kind on the flop, 15% by the turn, and 18.5% by the river. The chances of four of a kind are .24%, .49%, and .82% respectively. If you have two cards of the same suit there is about a .8% chance of getting a flush on the flop, a 2.8% chance of getting it by the turn, and a 5.8% chance of getting it by the river. The odds of getting a straight or pairing one of your cards can be found using the post-flop approximations.
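A sketch of those pre-flop calculations in Python. Note that this counts any board bringing at least one more card of the pair's rank (set, full house, or quads together), so the exact values land slightly above the rounded figures in the text:

```python
from math import comb

def hit_set(board):
    """P(at least one of the 2 remaining rank cards among `board`
    community cards drawn from the 50 unseen cards)."""
    return 1 - comb(48, board) / comb(50, board)

print([round(hit_set(n), 3) for n in (3, 4, 5)])   # flop, turn, river
flop_flush = comb(11, 3) / comb(50, 3)             # all 3 flop cards your suit
print(round(flop_flush, 4))
```

The flopped-flush figure agrees with the .8% quoted above.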

Sunday, August 10, 2008

Some Numbers from the 2008 Olympics

During the opening ceremonies, the US broadcast displayed a graphic with the size of the delegation and population of each country. I wondered what the correlation was between population and the number of athletes competing, and what other variables might influence delegation size. In addition to population, GDP, climate, and some metric measuring civil rights for women might also explain delegation size. When I went to research this I had trouble finding the country and delegate data. What I did find was a list of all the athletes online. I turned the data into an Excel file and played around a little with it. The file can be downloaded here if anyone wants it. It is a .csv, which makes it easy to import and analyze with R.
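For anyone who would rather use Python than R, here is a minimal sketch of the kind of summary described below; the rows and column names are made up for illustration, not taken from the actual file:

```python
import csv, io, statistics

# Toy stand-in for the athlete .csv (hypothetical columns).
toy = io.StringIO(
    "name,country,gender,discipline,year_of_birth\n"
    "A,USA,M,Athletics,1984\n"
    "B,USA,F,Swimming,1990\n"
    "C,Nauru,M,Weightlifting,1982\n"
)
sizes = {}
for row in csv.DictReader(toy):
    sizes[row["country"]] = sizes.get(row["country"], 0) + 1

print(sizes)                              # delegation sizes by country
print(statistics.median(sizes.values()))  # median delegation size
```

The same tallying on the real file would reproduce the delegation-size figures discussed below.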


I created some pivot tables in Excel. It is possible to do similar things in R with the tapply() function, but Excel makes pivot tables so easy to create and alter that I did not bother. I uploaded them to a Google spreadsheet and embedded a few of them at the bottom of this post. The rest of the data I mainly explored using R. There are 204 countries competing in this Olympics. The largest delegation belongs to the United States, with 618 athletes. Ten countries have just one athlete competing, among them: Aruba, Belize, Burundi, Central African Republic, Dominica, Gabon, Niger, and Nauru (which was featured in a surreal This American Life episode). The mean delegation size is 49.24 athletes. Surprisingly, the median delegation size is only 9: roughly half the countries sent 9 or fewer athletes. Random fact: only 27 of the 204 delegations have more women competing than men. Which two countries have the largest female-positive (i.e. more women than men) delegations?


Another column of data on the official site listed the disciplines (sports) in which athletes compete. There are 38 discipline classifications. The discipline with the most competitors is Athletics (Track & Field, I would guess) with 1943 competitors. Cycling BMX is the smallest event with only 24 competitors. The median and mean are 182.5 (between Baseball and Table Tennis) and 264 respectively. Random fact: there are five sports that are specific to only one gender. Which ones are they?


A little manipulation with R turned the Date of Birth data into Year of Birth and then into an 'age estimate' where I took 2008 and subtracted Year of Birth. The oldest athlete is Hoketsu Hiroshi, a 67 year-old man representing Japan in Equestrian. The youngest competitor is 12 year-old swimmer Antoinette Joyce Guedia Mouafo from Cameroon. The median and mean age estimates are 26 and 26.37 years. The Random Fact was going to be the average oldest and youngest delegation, but Excel started acting up and I was too tired to do more R (date modification can be tricky). There is plenty to explore in this data. Let me know if you find anything neat in the above file and don't be afraid to add more columns of data.


Answers: Norway and Sweden; baseball, softball, boxing, synchronised swimming, rhythmic gymnastics

Saturday, July 26, 2008

Hoops Analyst and R

Rather than write another convoluted post on poker and probability, I decided to play around a bit with the statistics program R. A post by Harlan Schreiber at the much enjoyed site Hoops Analyst and a recently purchased book on R inspired the effort. I highly recommend R, and this post will demonstrate some of the things one can do in it, even someone inexperienced in R and statistics such as myself.

Mr. Schreiber explores the notion that defense wins championships and offense is not as important. He presents a table of Offensive and Defensive Ranks of past champions, and the data leads to a simple conclusion: defense and offense have been equally important for past champions. Mr. Schreiber still manages to make the article interesting by adding depth and insight to a simple table. He can find history and anecdotes in the driest of data.

Statistical analysis can support Mr. Schreiber's conclusion. A paired t-test and a sign test are two simple ways to compare the data. I copied the data into Excel and then loaded it into R. I added two more columns to the data table: one labelled "Difference," which is Offensive Rank minus Defensive Rank, and a second labelled "Sign," which is 1 for values of Difference that are positive and 0 for Differences that are not. To perform a paired t-test the data has to be normally distributed. A Shapiro-Wilk test for normality (R command >shapiro.test(Difference)) on Difference produced the following result:

Shapiro-Wilk normality test
data: Difference
W = 0.9682, p-value = 0.5107

Typically we look for a p-value < .05 to reject the null hypothesis; with a p-value of .51 we cannot reject the hypothesis that Difference is normally distributed. A Q-Q plot (R commands >qqnorm(Difference) and >qqline(Difference)) and a histogram (R command >hist(Difference)) confirm this.




The pictures show a roughly normal distribution. A t-test can be used. Simply type t.test(Offensive.Rank,Defensive.Rank,paired=T) and R does the rest of the work:

Paired t-test
data: Offensive.Rank and Defensive.Rank
t = 0.1257, df = 28, p-value = 0.9009
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -2.637672 2.982499
sample estimates: mean of the differences 0.1724138
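As a sanity check on that output (a back-of-the-envelope reconstruction, not the original R session), the printed t statistic and mean difference imply the same confidence interval; 2.0484 is the t critical value for df = 28:

```python
from math import sqrt

n, t_stat, mean_diff = 29, 0.1257, 0.1724138
sd = mean_diff * sqrt(n) / t_stat    # implied std. dev. of the differences
margin = 2.0484 * sd / sqrt(n)       # t critical value * standard error
print(round(mean_diff - margin, 3), round(mean_diff + margin, 3))
```

The rebuilt interval matches R's (-2.637672, 2.982499) to rounding.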

With such a high p-value, we fail to reject the null hypothesis. There is no evidence that Offensive or Defensive Rank has been more important for prior champions.
Another possible test is the non-parametric sign test. This test may be preferable to the t-test in this instance since the data we are working with are ranks rather than continuous variables. Like many non-parametric tests, the sign test has fewer necessary conditions and does not require the data to be normally distributed. A simple binomial test can be used. A binomial test works much the same way that the binomial probability function does: the binomial probability function calculates the chance of k successes in n trials. For example, it can tell you the chance of getting 2 heads in 10 coin flips. To use a binomial test, each trial must have only two possible outcomes: heads or tails, success or failure, infected or not infected, ale or bad beer, and so on.

So how can we apply the principles of binomial probability to our data? That is the purpose of the Sign column explained earlier. All Differences > 0 were labelled as successes and assigned a value of 1. Running the binomial test in R is simple: >binom.test(14,27), where 14 is the number of successes and 27 is the total number of trials (29 trials minus the 2 trials where the Difference = 0). I typed binom.test(sum(Sign),length(Sign)-2), which will make sense as you familiarize yourself with R. The result:
Exact binomial test
data: sum(Sign) and length(Sign) - 2

number of successes = 14, number of trials = 27, p-value = 1
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval: 0.3194965 0.7133275
sample estimates:probability of success 0.5185185
So the observed success rate is roughly 51.9%, about as even as the data could be, and there is no reason to reject the null hypothesis. Once again, the data does not support the claim that defense has been more important to past champions than offense. How many successes would we need to get a p-value less than .05? Do your best, R-san. The command is > qbinom(.975,27,.5), which returns 19, so roughly 19 successes would be needed for a statistically significant difference between Offensive and Defensive ranks. For the heck of it here is a box plot and summary of the data.
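The same exact test can be re-implemented from scratch (a hypothetical stand-in for R's binom.test with p = 0.5, not the original session). Interestingly, the exact two-sided p-value at 19 successes comes out just above the .05 threshold, so a strict reading of binom.test would ask for 20:

```python
from math import comb

def pmf(k, n=27):
    """Binomial probability of k successes in n fair-coin trials."""
    return comb(n, k) * 0.5 ** n

def two_sided_p(successes, n=27):
    """Sum the probabilities of all outcomes at least as extreme
    (R's binom.test convention for the two-sided p-value)."""
    cutoff = pmf(successes, n)
    return sum(pmf(k, n) for k in range(n + 1) if pmf(k, n) <= cutoff + 1e-12)

print(two_sided_p(14))            # the p-value of 1 reported above
print(round(two_sided_p(19), 3))  # 19 successes: just over .05
print(round(two_sided_p(20), 3))  # 20 successes: clearly under .05
```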



Sunday, July 20, 2008

The Basics of Poker

No post last week. The author was a bit worried that someone had given Martin another box of silver, but in fact it was something quite different. Enough personal stuff, onto the math.

Say one was to go play a game of Texas Hold'em. What should the player know? Blackjack and craps have fairly simple strategies to follow to maximize expected value, or rather to minimize expected losses. For blackjack all someone has to do is remember a simple table. Poker is different. The basic requirement is to know the rules: what hands beat which hands, the order in which people bet, and generally how the game unfolds. Once the basics are learned, the two most important concepts are the Fundamental Theorem of Poker and bluffing. The fundamental theorem dictates that a player play his hand as if he could see everyone's cards and always apply the correct pot odds. Every time a player does this he increases his expected gain. Every time a player fails to do this, he is losing money.

Bluffing adds depth to poker. A player cannot follow the fundamental theorem perfectly. He can only guess what his opponents have. If a player plays his hand based on the fundamental theorem and never bluffs, his opponents will be able to accurately guess what cards he has and put him at a disadvantage. The skill in poker comes from balancing the concepts of the fundamental theorem and bluffing.

Those are the basics. Next it is important to get a sense for how often certain hands occur. Playing repeated hands of poker will give a player a good feel. A player gets two cards before he has to decide whether to play the hand or fold. What types of hands should a player ante up for and how often do those hands occur? Let us say a player likes to play the following types of hands: a pair, cards of the same suit that are adjacent (suited connectors), and two high cards (two cards that come from the set of 10, Jack, Queen, King, and Ace). Those are generally regarded as good hands to ante up on as they can lead to strong hands. Let the sets be defined as A for pairs, B for suited connectors, and C for high cards. The probability of getting one of those hands is the union of those three sets:

P(A U B U C) = P(A) + P(B) + P(C) -[P(AB) + P(AC) + P(BC)] + P(ABC)

Where 'U' indicates a union of sets and 'AB, ABC etc.' indicates an intersection of sets. For unions of sets, the basic rule is to add the sets of odd intersections and subtract the even intersections.

P(A) = probability of getting a pair. The first card can be anything. For a pair to occur, the second card has to be one of the 3 matching cards among the 51 remaining in the deck. The first card in P(B) can be any card. For any card, there are two remaining cards that make suited connectors. If the first card is the Ace of Spades, the second card has to be the King or 2 of Spades. P(C) can only have a 10, Jack, Queen, King or Ace for its first and second cards. There are 20 such cards in the deck and 19 remaining after the first one has been dealt.
P(A) = (52/52)*(3/51)
P(B) = (52/52)*(2/51)
P(C) = (20/52)*(19/51)

The intersection P(ABC) and P(AB) cannot occur since two cards cannot be both a pair and suited connectors. P(AC) is the set of cards that are pairs of 10, Jack, Queen, King, or Ace. There are 13 different types of cards and P(AC) makes up 5 of those types (10, 10; Jack, Jack etc). P(AC) can be expressed as the portion of A that falls into C. Likewise with P(BC) with the caveat that only 4/13 types of suited connectors of B occur in C (since 10, 9 and Ace, 2 do not occur in C).
P(AC) = P(A)*(5/13)
P(BC) = P(B)*(4/13)

Plugging those into the formula:
P(A U B U C) = 20.6%
Thus a player that plays these types of hands will play roughly one of every five hands. To keep opponents from getting a clear read on what type of hands you prefer it may be advisable to play a junk hand occasionally.
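The arithmetic above can be verified with a short script:

```python
# Inclusion-exclusion over the three starting-hand sets defined in the post.
pa = 3 / 51                   # P(A): pair
pb = 2 / 51                   # P(B): suited connector
pc = (20 / 52) * (19 / 51)    # P(C): two cards ten or higher
pac = pa * (5 / 13)           # P(AC): high pairs
pbc = pb * (4 / 13)           # P(BC): high suited connectors
union = pa + pb + pc - pac - pbc   # P(AB) and P(ABC) are impossible
print(round(union, 3))             # about one hand in five
```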

I like playing suited connectors and pairs because if the next five cards improve your hand, a clear advantage can emerge. For example, if you play a suited connector and three of the next five cards are the same suit, you get a flush. Unless there is a pair among those five cards, a flush will very likely be the best hand. But given that you played a pair or suited connector, what are the odds that the next five cards will improve your hand? I will attempt to answer this question in the next post.

Sunday, July 6, 2008

The Gauntlet Revisited

An earlier post described how a player on the TV show The Gauntlet III could calculate his odds of winning a Gauntlet and whom he should select as an opponent. This post will explore team strategy, in particular why the men of the Veteran team decided to purposely lose team challenges.

If a team loses a challenge, two of its members have a duel in the Gauntlet. The losing player has to leave the show. After an unknown number of Gauntlets, the show has one final challenge in which the two teams compete for $300,000. The winning team divides the $300,000 equally between the remaining members. Going into the team challenges, the participants know whether it is a "guys'" or "girls'" day. On guys' days two men from the losing team duel in the Gauntlet. On girls' days two women compete. On guys' days the women face no punishment (i.e. the Gauntlet) for losing, and the men face no punishment for losing on girls' days. To entice the opposite sex to compete on their respective days, the show offers a prize to the winning team. On girls' days, men of the winning team each get a prize of roughly $500. However, if a team loses then one of its members will leave the show, meaning a larger portion of the $300,000 grand prize will go to those who remain.

First let's take a look at how much each remaining member stands to gain when a teammate loses in the Gauntlet. The left column is the number of people remaining on the team, the middle column is how much each member will receive if the team wins the final challenge, and the right column is how much more a player gets when a teammate leaves the show.

Teams start with 16 people and every time someone leaves, the remaining members can potentially win more prize money. The 'Difference' column shows that the more people leave, the more the rest stand to gain. When the first person leaves, everyone stands to win $1,250 more. When the 10th person leaves, the team members can win over $7,000 more. For example, in the final challenge the Veteran team stood to win roughly $30,000 each and the smaller Rookie team $60,000.
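The table described above can be rebuilt in a few lines (assuming, as stated, a $300,000 prize split evenly among the survivors of a 16-person team):

```python
# Payout per remaining member, and the gain each time a teammate leaves.
prize = 300_000
payouts = {n: prize / n for n in range(16, 1, -1)}
for n in range(16, 2, -1):
    diff = payouts[n - 1] - payouts[n]
    print(n - 1, round(payouts[n - 1]), round(diff))
```

The first departure is worth $1,250 to each survivor; by the tenth (16 down to 6 members) the marginal gain is over $7,000.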

Let's take a look at the normal form representation of a single stage of this game. Let us assume that it is a girls' day and we are looking at the representation from the men's view. Assume that the men think they have a 50% probability of winning the final challenge, thus they stand to gain $625 in expectation if they lose the challenge and a teammate (1250*.5 = 625). The men from both teams have the same strategy set: Try or Shirk. A team that chooses to Try will win 100% of the time against a Shirking team. Or perhaps more accurately, the Shirking team can make sure it loses with 100% certainty. If both teams Try they each have a 50% chance of winning; if both teams Shirk they each have a 50% chance of winning.



Thus Shirk dominates Try, since the expected $625 (a 50% chance at the extra $1,250) is greater than the $500 prize for winning the challenge. In the next stage game, the team that lost a member has a chance to win even more than $625, since the figures in the Difference column grow as more people leave the show. Using the binomial theorem one can calculate how much present value one gains by having the chance to lose more teammates. In the first stage matrix, the $625 jumps to over $3,000 in present value. The question is not why the Veteran men decided to Shirk challenges, but why they did not start Shirking earlier.

The payoff matrix shows that the $500 prize is no deterrent. Are there other deterrents to Shirking? The final challenge determines the grand prize. If teams with more members had an advantage in that challenge, then that would be a deterrent to Shirking. However the format of the final challenge, common knowledge to the players because of previous versions of the show, does not favor large teams. It favors teams that do not have weak links and ones with strong athletes. Many Veteran men purposely Shirked because they thought it would improve their chances of winning the final challenge. The women realized that their probability of winning the final challenge would decrease if they lost strong athletes and were less likely to Shirk. A third deterrent might be an emotional reason. Perhaps pride, a competitive nature or a connection to members of the opposite sex motivated players not to Shirk. That might explain some of the romantic relationships.

The only tool the women have to deter Shirking is the threat of reciprocal punishment: "If you Shirk and send one of us to the Gauntlet, we will Shirk and send one of you to the Gauntlet tomorrow." This strategy is undermined because losing too many athletic members hurts everyone on the team and because the remaining women also benefit from losing weaker members. By Shirking and sending strong men home, the women hurt themselves; by losing other women, they stand to make more money and have a better chance of winning the final challenge. The Veteran men chose to Shirk, and the women had little recourse.

If both teams' men had realized the dominance of Shirk, there would have been some interesting repercussions. Both teams' men would be trying to lose on girls' days. To deter this, both teams' women might play the 'Mad President' strategy (act irrational, not a stretch for this group) and try to Shirk, the final challenge be damned. The Nash Equilibrium would probably involve an agreement between the men and women on each team, with both sides agreeing that a certain amount of Shirking, to lose the weaker male and female competitors, is ideal, and to Shirk only strategically in order to maximize the probability of winning the final challenge.

Sunday, June 29, 2008

Late Goals in Euro 2008

While watching Spain defeat Russia in the semi-finals of the Euro 2008 soccer tournament, a stat flashed on the screen. 23 out of the 79 goals in the tournament were scored after the 75th minute. Disregarding overtime and stoppage time, the 75th to 90th minute is about 1/6 of the game. So one would presume that 1/6 (roughly 17%) of the goals would occur in that interval. But in fact 23/79, roughly 30%, of the goals were scored then. Is this statistically significant?

Let p^ = the sample statistic (23/79) and p(o) = the expected population statistic (1/6). Let α = .05 be the threshold. If the discovered p-value is < .05 we reject the null hypothesis. The null hypothesis is that p^ = p(o). For this let's use a one-proportion z-test.


At this point in the tournament there had been 29 games and 79 goals. Each goal is the trial being classified as late or not, so let n = 79. p^ = p(hat) (I am unsure how to type hats or subscripts), p(o) = p0 and z-score = z. Plug the values into the formula to get the z-score... the z-score = 2.97, making the p-value < .05, so we reject the null hypothesis.
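The test statistic can be checked in Python. One note on the setup: each goal is the Bernoulli trial being classified as late or not late, so n here is the number of goals (79) rather than the number of games:

```python
from math import sqrt

p_hat, p0, n = 23 / 79, 1 / 6, 79
# One-proportion z-test: (sample - expected) / standard error under the null.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
print(round(z, 2))
```

A z-score near 3 puts the p-value well under .05.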

Does it make sense that such a high percentage of goals would be scored in the final 15 minutes? It brings to mind an analysis I read about US presidential elections. The author asserts that the trailing candidate should pursue the "pull the goalie" strategy used in hockey. The trailing hockey team is so desperate to score that the coach pulls the goalie and sends an extra attacker into the game. By pulling the goalie the team expects to have more chances to score but will certainly be easier to score upon. The trailing team wants to utilize a strategy that increases the overall variance at the expense of the optimal strategy. Over the long term this high variance strategy is not as effective as the optimal strategy, but the trailing team is not playing for the long term. It is playing for the short term. The trailing team will use a high variance strategy, resulting in both it and the opposition scoring more goals.

This was evident today in the Euro 2008 finals. Germany pursued the high variance strategy in the final 15 minutes at the expense of the optimal strategy. The German team had more chances to score, but at the same time allowed the Spanish side some great opportunities.

Which Half is the What Now?

The New York Times posted an article on the questionable efficacy of cardiac CT scans. There is a surprising quote from the website of a doctor who supports CT scans: "Half of Americans have died of heart attacks and strokes. Which one are you?" This statement is absurd. Rather than dissect it grammatically (I am unsure why the NYTimes would include it), it can be taken apart with logic and algebra. There are about 300 million Americans alive right now. Let x equal the number of Americans who have died of heart attacks or strokes (I assume the doctor meant 'or' (the union of two sets), not 'and' (the intersection)) and let y equal the number of Americans who have died of anything else, making x + y the total number of Americans who have died. From that quote we know that:

x / (x+y+3.0*10^8) = 1/2
2x = x + y + 3.0*10^8
x - 3.0*10^8 = y

This tells us that x > 3.0*10^8 and x > y, assuming that x + y > 0. The percentage of dead Americans who have died of heart attacks or strokes is equal to x/(x+y). Since x > y, x/(x+y) > 1/2. For example, suppose that 500 million Americans have died. Plugging that number into the formula:

x / (5.0*10^8 + 3.0*10^8) = 1/2
x = 4.0*10^8
x / (x+y) = % of dead Americans who have died of heart attacks or strokes
4.0*10^8 / 5.0*10^8 = .8, or 80% of dead Americans have died of heart attacks or strokes
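The algebra above can be restated in code, using the hypothetical 500 million dead from the example:

```python
alive = 3.0e8          # roughly 300 million Americans alive
dead_total = 5.0e8     # the supposed 500 million dead (x + y)
# From x / (x + y + alive) = 1/2:
x = (dead_total + alive) / 2
print(x, x / dead_total)   # heart attack/stroke deaths and their share of all deaths
```

Any value of dead_total above zero gives a share strictly greater than one half.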

I am unaware of what twisted game the doctor is playing. He is trying to emphasize the risk of heart attacks and strokes with the "1/2 comment," when he could quote x / (x+y), the percentage of dead Americans who have died of heart attacks or strokes, which is larger than 1/2. A mystery of a man who is trying to shock the public and undersell the problem at the same time.

On a less mathematical note, there are worse things than heart attacks or strokes to have as the leading causes of death in a society. I am reading the book Chances Are... . I am surprised by how short the life expectancy for Londoners was in prior centuries and how resistant to statistical studies the medical community was. Unfortunately, as the NYTimes article details, there is still resistance to evidence-based medicine.

Friday, June 20, 2008

When to Foul the Shooter

The conventional basketball wisdom is to foul the shooter rather than give up an easy lay-up. If a shooter has an easy shot, say a shot he makes 95% of the time, it is better to foul him and let him attempt two free throws rather than take the easy shot. As long as he does not make over 95% of his free throws (which almost nobody does), the defensive team will allow fewer expected points. If the offensive player only makes 50% of his free throws, fouling him saves .9 expected points: 2*.95 - (1*.5+1*.5) = .9. Fouling the league average shooter, who makes 75.2% of his free throws, saves about .4 points. Fouling has another detriment that is not accounted for in that math. Once a team has committed five fouls, the other team goes to the free throw line for every subsequent defensive or loose ball foul, regardless of whether the player was in the act of shooting.

During the 2006-2007 NBA season, teams scored about 1.1 points per possession and made 75.5% of free throws. If the offensive team is in the bonus, a non-shooting foul results in the defense allowing about .43 more expected points than it does on an average possession (2*.755 - PPP = .43, using the unrounded points-per-possession figure). It is better to let the possession elapse without committing a non-shooting foul. The defensive team would rather not be in the bonus. Should you still foul the shooter on an easy shot?

Before estimating the additional penalty that committing a foul detracts, here are some statistics:


Free throw % and the other stats needed to calculate Points Per Possession and Mean Fouls were found here. Points Per Possession was found by taking the league average Points Per Game and dividing by the league average 'Pace' statistic. PPG/Pace = PPP. I calculated Mean Fouls by taking the league average minutes per season, dividing by 5 to make the stat minutes per team, and then dividing the league average fouls per team by that figure to get fouls per minute. I multiplied by 12 (minutes in a quarter) to get fouls per quarter. (Fouls per Team)/(Minutes/5) * 12 = FPQ.

I will take an extreme case to see if there is an instance where it would be better to let the offensive player have an easy shot rather than foul. Suppose that the offensive player has an easy shot at the start of the quarter. Should he be fouled? The Poisson Distribution can be used to show how often a team reaches a certain number of fouls.

In this case λ = 5.51 (average fouls per quarter) and k = fouls in a quarter. The first line of the table below is the random variable k, ranging from 0 to 14. The percentage below it is the Poisson probability of exactly that many fouls happening in a quarter. For example, the most likely outcome is a team committing 5 fouls in a quarter, which happens 17.1% of the time. Below that number is 'Points Lost.' Points Lost is the amount of points the defense loses by fouling a player and letting him shoot free throws. As shown above, by fouling the defense allows .43 more points than it would if it did not foul. Multiplying that by the Poisson probability gives the points lost. For example, if a team fouls 5 times in a quarter, on the 5th foul the opposing team goes to the free throw line and gets .43 more expected points; this happens in 17.1% of quarters. The penalty increases as the team fouls more often. Fouling 6 times a quarter sends the opposing team to the free throw line on two occasions, allowing the offensive team to get .86 more points per quarter. The formula is .43*Poisson%*(k-4) = Points Lost, the 4 being used because every time a team commits k > 4 fouls the opposing team goes to the free throw line k-4 times. Total Points Lost is the sum of expected Points Lost for each value of k.



To return to the extreme case, fouling at the beginning of the quarter has the effect of the offensive team needing to draw only 4 more fouls (rather than 5) in order to shoot free throws. Compared to the previous example, this increases the Total Points Lost by the defense: committing 4 fouls now results in Points Lost, and the penalty for committing more fouls increases. For this case, Points Lost = .43*Poisson%*(k-3).
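The two scenarios can be recomputed with a short Poisson script; it sums the full tail rather than stopping at k = 14, which is why the difference comes out at .344 rather than the table's .343:

```python
from math import exp, factorial

lam, penalty = 5.51, 0.43   # mean fouls per quarter, points lost per bonus foul

def pmf(k):
    """Poisson probability of exactly k fouls in a quarter."""
    return exp(-lam) * lam ** k / factorial(k)

def total_points_lost(free_fouls):
    """Expected points lost when free throws start after `free_fouls` fouls."""
    return penalty * sum(pmf(k) * (k - free_fouls)
                         for k in range(free_fouls + 1, 60))

normal = total_points_lost(4)   # usual case: the 5th foul sends them to the line
early = total_points_lost(3)    # early foul: only 4 more fouls needed
print(round(normal, 3), round(early, 3), round(early - normal, 3))
```

The difference stays under the .4 points the foul saves, which is the comparison made below.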

If the difference between Total Points Lost (early foul) and Total Points Lost (normal case) increases by more than .4 points (the amount of points the defense saves by fouling the league average player on an easy shot), then the defense would be better off not fouling. The table below shows the comparison:



The difference in Total Points Lost is .343, which is < .4. Although the defense's Total Points Lost increases with the early foul, it does not increase enough to justify letting a player have an easy shot. Thus the conventional wisdom is reaffirmed for the league average player: foul him rather than let him have an easy shot. If the difference had been more than .4, the next step would have been to calculate a more accurate Total Points Lost by accounting for offensive fouls (which do not result in free throws) and shooting fouls (which always result in free throws). However, that is not needed, and the reader will never learn that 9.8% of all fouls committed during the 2006-2007 NBA season were offensive fouls.

While all I did was reaffirm the conventional wisdom, I believe the data suggests there may be special cases when the defense should not commit the early foul. If the offensive player makes easy shots somewhat less than 95% of the time and makes free throws somewhat more than 75% of the time, it may be worth not committing an early foul on him. NBA teams with access to more exact stats would be advised, especially during 7 game playoff series, to calculate exact figures for specific players to see whom not to foul and when.

Sunday, June 15, 2008

Bringing Math to the Gauntlet

Real World/Road Rules Challenge: The Gauntlet III is a reality show that I embarrassingly happen to enjoy. Two teams compete in random challenges. The losing team has to participate in the Gauntlet, a one-on-one duel where the winner gets to stay on the show and the loser leaves. The winning team selects the first player to enter the duel and the losing team chooses his opponent. Often the losing team would let the first player pick whom he would face in the duel. I hope to show how a player can choose an opponent that will give him the highest probability of winning the Gauntlet.

To determine which game the players will play in the Gauntlet, a wheel with six outcomes (five different games and a spin again) is spun. Since all five games have an equal chance of being selected, Player 1 can calculate his odds by estimating his subjective probability of winning each individual game, summing the probabilities, and dividing by five (the number of games). For example, if Player 1 believes that he is evenly matched with his opponent in all five games, his chances of winning are 50%, shown by the formula Pv:

Pv (Probability of Victory) = (.5 + .5 + .5 +.5 + .5)/5 = .5 = 50%

The games are varied and certain games favor certain types of players. It is unlikely that a player would estimate himself as evenly matched with his opponent in all five games. Force Field (basically tug of war with pulleys) favors strength and body weight. Ankle Breakers (reverse tug of war, with a rope tying a player to his opponent's ankle) and Ram it Home (a shoving match of sorts) also favor strength and body weight. Sliders is a puzzle game that does not require athletic ability. Ball Brawl (a race to grab and carry a ball across a goal line) gives the advantage to the faster player. Let's do another example. This time Player 1 can select a weaker, smaller Player 2, who is faster and smarter (one would assume giving Player 2 an advantage in the puzzle game) than Player 1. Player 1 calculates that the games in his favor give him a 70% chance of winning, and only a 30% chance in the games that favor Player 2.

Pv = (.7 + .7 + .7 +.3 + .3)/5 = .54 = 54%

Another wrinkle. Ball Brawl is a repeated stage game. The winner is the first to score 4 points. Repeated stage games, compared to single elimination games, favor the player with the higher probability of winning, much like the 7 game series of the NBA playoffs favor the better team more than the NCAA college basketball tournament does. If Player 1 believes that his chance of scoring in each stage game of Ball Brawl is 30%, his overall odds of winning the game drop to roughly 18%. The logic is that it is easier for Player 1 to convert one 30% chance than to convert multiple 30% chances. The actual math can be calculated using the binomial theorem:

The winning player needs to score 4 points. There are five stage games in which to score points. In the first three stages, grabbing a ball and returning it over the goal line is worth 1 point. In the last two stages, a successful score is worth two points. The winning player needs to score 2 or 3 points in the first three stages and 2 points in the final two stages or to score 4 points in the final two stages. Say the probability of scoring in a stage game is .3. The function b(x) can be created to calculate the chance of winning ball brawl. b(.3) =


= .181 = 18%

Or in Excel:

=BINOMDIST(2,2,.3,FALSE)+BINOMDIST(1,2,.3,FALSE)*(BINOMDIST(3,3,.3,FALSE)+BINOMDIST(2,3,.3,FALSE))

BINOMDIST(s, n, p, false) where s = number of successes, n = number of trials, p = probability of success for a trial, false = not cumulative
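The same calculation translated to Python, mirroring the BINOMDIST breakdown above (both double-point stages, or one double-point stage plus 2 or 3 of the single-point stages):

```python
from math import comb

def binom(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def b(p):
    # Win both 2-point stages, or win exactly one of them
    # plus 2 or 3 of the three 1-point stages.
    return binom(2, 2, p) + binom(1, 2, p) * (binom(3, 3, p) + binom(2, 3, p))

print(round(b(0.3), 3))   # the 18% figure from the post
```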

Formally, Pv = (p1 + p2 + p3 + p4 + b(p5)) / 5

where p1 = estimated probability of winning game 1, p2 = game 2, ..., p5 = prob of winning a stage in ball brawl

So what should Player 1 look for in an opponent? Since three games emphasize strength and body weight, Player 1's first criterion is to choose a weaker opponent. After that, the repeated stages of Ball Brawl, and the advantage they give to the faster player, dictate choosing a slower opponent. The last criterion to evaluate is a potential opponent's intelligence. Big players will pick on small players, and small players will choose weaker, slower, and/or less intelligent small players.

More formally, the dominant strategy is to pick a person i ∈ S (the set of all players) for Player 2 such that Pv(i) ≥ Pv(j) for all players j ∈ S.

While I would be surprised if anybody on the Gauntlet reasoned his or her opponent selection out to this degree, it does explain why a player like Eric survived to the end of the competition. Eric was not suited for the final competition, but his sizable body weight and strength advantage made him an opponent no one wanted to face in the Gauntlet.