AUGUST 5, 2011 9:05AM

Spike’s Deadliest Warrior: Are the winners true victors?

Figure 1

Segments most liked by viewers: gel torso used for testing a weapon

Source: SpikeTV 

Spike TV's The Deadliest Warrior, now in its 3rd season, is a television program that creates hypothetical battles between famed historical or current fighters (or 'warriors', as they put it) to see which one would be the 'deadliest'. In the past, this has included such awesome (and occasionally bizarre) matchups as the American Green Berets versus the Soviet Spetsnaz, the Nazi SS versus the Viet Cong, Al Capone and his cronies versus Jesse James and his gang (my wife's favorite), and a pirate versus a medieval knight. (The knight lost.) Later this season, we'll see a bloodthirsty battle between Vampires and Zombies. Yes, Team Edward will go up against the shambling undead. I'm rooting for the Zombies.

This probably sounds a lot more like a drunken argument after a few too many rounds of Halo, but to give Spike credit, the winner of any given matchup is determined by what folks in my line of work call simulation analysis. What that means, basically, is that every potential feature each 'warrior' would bring to the fight is examined, then turned into numerical percentages and number-crunched via a computer program that then simulates a thousand different battles. The Deadliest Warrior is of course the side that wins most often based on the comparison of the data.

In the first and second seasons this really just amounted to a bunch of gleefully fake blood-soaked weapons testing, which was awesome. In the third and current season, however, other dimensions were added to the simulation taking into account several 'X-factors' (technically these are actually input variables, but I guess X-factors sounds cooler), such as terrain, the warriors' tactical knowledge, discipline, psychological impact of the weapons used, the warriors' size, strength and overall health, etc. All told, there are about 100 of these X-Factors (input variables).

Now if they would just cut out the irritating trash talk between the guest combat experts, the show would be just about perfect.

As I mentioned above, the key dénouement of the show consists of simulating a series of battles between the two opponents (using the computer program, aka a black box), and we get to see who wins via a fictional battle between the two warriors (played by actors and stunt people). I love how obvious it is that all the final battle segments are filmed in California (even though the warriors featured in the episode lived, say, in Asia or the Arabian Peninsula).

In order to find out Who Is Deadliest? (The catch phrase of the show and definitely said in ALL CAPS), battles are simulated 5,000 times (4,000 more times than in Seasons 1 & 2). Then, the results are tabulated and the warrior who wins the most battles is declared the ultimate victor. So far, this season's victors are the ones listed first in each matchup:

George Washington vs. Napoleon Bonaparte
2,530 (50.60%) vs. 2,470 (49.40%)

Joan of Arc vs. William the Conqueror
2,587 (51.74%) vs. 2,413 (48.26%)

U.S. Army Rangers vs. North Korean Special Operation Force
2,504 (50.08%) vs. 2,496 (49.92%)

Figure 2

Season 3 Episode 1 - George Washington trying to motivate his troops

Source: SpikeTV 

But if you look carefully at the stats, the pairing results are very, very close. In fact, the greatest difference is about 3.5 percentage points. As someone who uses statistics all the time in my research, I can tell you that the difference in percentages is statistically insignificant for every matchup except Joan of Arc versus William the Conqueror. In other words, most of the time both warriors are deadliest.

The reason for this has to do with basic stats, but when I tried to explain it to my darling wife her brain exploded. If you don't know much statistics, I'm afraid you'll have to take my word for it. For the more statistically inclined, I've laid out Spike's simulation problem below:

It all comes down to an extremely important item that's missing from the sims: the uncertainty associated with the simulation output.

As you know, if we could already be certain about the outcome of a given system, we wouldn't have to conduct a simulation study in the first place. Thus, simulation is used to give us an idea about how a system (in this case, the warriors) would behave when we don't know all the characteristics of the input variables that influence the simulation outcome. The simulation output always provides a mean value and the variance associated with this estimated value. The variance and the standard deviation (the square root of the variance) give the measure of uncertainty.
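To put a rough number on that uncertainty, here is a minimal sketch. It assumes (my assumption, not anything Spike has published) that each of the 5,000 battles is an independent coin flip with a fixed chance of winning, so the win rate has a simple binomial standard error. The function name `win_rate_uncertainty` is made up for illustration, and this only captures run-to-run randomness, not the uncertainty in the inputs themselves.

```python
import math

def win_rate_uncertainty(wins, battles):
    """Return (win rate, standard error, 95% interval) for one warrior.

    Assumes every battle is an independent trial with the same win
    probability, which is a simplification of the real simulator.
    """
    p = wins / battles
    se = math.sqrt(p * (1 - p) / battles)  # binomial standard error
    margin = 1.96 * se                     # half-width of a 95% interval
    return p, se, (p - margin, p + margin)

# George Washington's 2,530 wins out of 5,000 simulated battles
p, se, (low, high) = win_rate_uncertainty(2530, 5000)
print(f"win rate {p:.3f}, standard error {se:.4f}, "
      f"95% interval {low:.3f} to {high:.3f}")
```

Under these assumptions the interval for Washington straddles 50%, which is another way of saying a coin flip can't be ruled out.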

Even with the 100 X-factors, the simulation output should provide the variance associated with the estimate. In truth, these input factors should be random variables (with a mean and variance) in order to properly simulate the outcome of the battles (using a Monte Carlo simulation protocol[1]). According to what I've seen on the show, these X-factors are subjective values provided by the hosts and other experts. This means that the simulation output could be biased (as suggested by the fact that American fighters have lost the simulated battles only once in three seasons).
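To make the random-input idea concrete, here is a toy sketch of a Monte Carlo protocol in which the whole 5,000-battle run is repeated many times with freshly drawn inputs. Everything in it is hypothetical: the roughly 100 X-factors are collapsed into a single win probability drawn from a normal distribution, and the names `simulate_battles` and `repeated_runs` and the values `mean_prob` and `prob_sd` are placeholders of mine, not anything from the show's simulator.

```python
import random
import statistics

def simulate_battles(win_prob, battles, rng):
    """Count wins for warrior A over `battles` independent fights."""
    return sum(rng.random() < win_prob for _ in range(battles))

def repeated_runs(mean_prob=0.5, prob_sd=0.02, battles=5000,
                  repeats=100, seed=1):
    """Repeat the whole simulation with randomly drawn inputs.

    Returns the mean and variance of warrior A's win percentage
    across the repeats. The inputs are a stand-in: one random win
    probability per repeat instead of ~100 separate X-factors.
    """
    rng = random.Random(seed)
    percentages = []
    for _ in range(repeats):
        # Draw a new (hypothetical) input value, clamped to [0, 1]
        win_prob = min(max(rng.gauss(mean_prob, prob_sd), 0.0), 1.0)
        wins = simulate_battles(win_prob, battles, rng)
        percentages.append(100 * wins / battles)
    return statistics.mean(percentages), statistics.variance(percentages)

mean_pct, var_pct = repeated_runs()
print(f"mean win percentage {mean_pct:.2f}, variance {var_pct:.2f}")
```

The point of the design is that the spread of the win percentages now reflects uncertainty in the inputs as well as battle-to-battle luck, which a single run of 5,000 battles with fixed inputs cannot show.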

Given the lack of information about the uncertainty or variability associated with the simulation process, we can still use an approximation method (described in the next paragraph) to examine whether the differences between the percentages are statistically significant. In other words, we'll be able to determine with great confidence whether or not the Zombies kick the Vampires to the curb. I guess we'll see if what I foresee will happen this coming September.

In statistics, we can use the t-statistic test to check whether two percentages are statistically different. A t-statistic above 1.96 (in absolute value), which corresponds to a p-value below 0.05 (or 5%), shows that the two percentages are different (at an acceptable level of uncertainty). Note that we can use different values for the t-statistic to measure different levels of uncertainty, but I'll leave that for another day.
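For readers who want to try this at home, here is a minimal sketch of one common way to run such a comparison. It assumes a pooled two-proportion z-test, which for samples this large is practically identical to a t-test; the post doesn't spell out the exact formula, so treat this as one plausible version. The function name `compare_win_rates` is hypothetical, and the win counts are the Season 3 results listed earlier.

```python
import math

def compare_win_rates(wins_a, wins_b, battles):
    """Pooled two-proportion z-test.

    Returns (test statistic, two-sided p-value). Treats the two win
    counts as independent samples of `battles` fights each, which is
    an approximation.
    """
    p_a = wins_a / battles
    p_b = wins_b / battles
    pooled = (wins_a + wins_b) / (2 * battles)
    se = math.sqrt(pooled * (1 - pooled) * (2 / battles))
    z = (p_a - p_b) / se
    # Two-sided tail area under the standard normal curve
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

matchups = {
    "George Washington vs. Napoleon Bonaparte": (2530, 2470),
    "Joan of Arc vs. William the Conqueror": (2587, 2413),
    "U.S. Army Rangers vs. North Korean Special Operation Force": (2504, 2496),
}

for name, (wins_a, wins_b) in matchups.items():
    z, p = compare_win_rates(wins_a, wins_b, 5000)
    print(f"{name}: statistic={z:.2f}, p-value={p:.4f}")
```

With these counts the statistic comes out at about 1.20, 3.48, and 0.16 for the three matchups, so only the Joan of Arc pairing clears the 1.96 bar.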

So, what does this tell us about the Season 3 episodes so far?

Accounting for the uncertainty (probably under-estimated for the test I'm using), we get the following results when we compare the percentages:

George Washington vs. Napoleon Bonaparte
t-statistic=1.20, p-value=0.23

Joan of Arc* vs. William the Conqueror
t-statistic=3.48, p-value=0.0005

U.S. Army Rangers vs. North Korean Special Operation Force
t-statistic=0.16, p-value=0.88

Thus, based on the results above, only Joan of Arc appears to be a clear winner. For the other two, no one can claim to be Deadliest. Oops.

Maybe the combatants are too evenly matched. On the other hand, if the results are too different, the viewers might complain that the outcome is so obvious that there is no need to have them 'fight to the death'.

Uncertainty, after all, makes things more interesting. Especially if you're into stats.

 An example of the carnage often seen on The Deadliest Warrior

[1] For a true estimate of the variability, the 5,000 simulation runs should be performed, say, 100 or 1,000 times with different input values. Then, we summarize all the percentages in order to get the mean (average) and the variance.

Thanks to Taste is Sweet for her input.

I came across the show on Netflix, on the "instant play" service. I saw the Green Beret vs. Spetsnaz show. It seemed to me that too much emphasis was placed on the individual and his training and weapons, and not enough emphasis on systemic components such as intelligence, logistics, communications, and so on.

Surely both soldiers are capable and well-equipped. But the Green Beret is going to have an entirely different view of the battlefield, with live information. He'll be able to call upon cruise missile strikes from UAVs and offshore submarines. He'll have close ground support from A-10s, and air support from aircraft that can shoot down everything else in the sky. And that's just the start.

Add in such other factors, and there's no comparison.

Hey Mishima666: You’re absolutely right that the simulation is heavily soldier-centric. This has been a major complaint about the first two seasons. Now, they have a new simulator (based on a gaming system – weird), which takes into account more input variables. Although not perfect, it makes for more realistic “battles.”

I have to say that watching this show is one guilty pleasure of mine. However, I like this season better since they also have well-known historians who discuss various aspects of the combatants that are not well known to the general public. This part is very informative. Sometimes, I like their "aftermath" internet show even better, since there is no trash talking and the experts often expand on the topics they initially raised on the main show.