Statistical Analysis of Stat Distribution

Statistical Analysis of RotMG Character Stat Distributions
OtherBill, Ph.D.
2014 Jan 08

For a tl;dr (too long, didn’t read) or a th;dm (too hard, didn’t math), feel free to skip ahead to the percentile tables.

Abstract

When players roll new characters, different players have different criteria of what makes a “good” roll. Some are happy with any above-average hp, some want +10hp or better. In addition, when the game gives a player a truly outstanding roll, the player might wonder just how rare that roll might be. This thread hopes to address these questions.

Introduction

In RotMG, all new characters (level 1) begin with the same character stats. Upon each level-up, these stats increase randomly; the most interesting of these is HP, which is increased by a uniformly-random value between 20 and 30 (inclusive). The end result is that all unpotted level 20 characters collectively have character stats that approximate a “bell curve” normal distribution.

For example, when you roll a single six-sided die, all six possible results are equally likely:

When you roll two six-sided dice and add their values, there are more ways to generate the numbers in the middle of the range of possible results. For example, there is only one way to generate the result of 2 (1+1) while there are six ways to generate the result of 7 (1+6, 2+5, 3+4, 4+3, 5+2, 6+1):

When you roll three six-sided dice, this effect continues: the numbers in the middle of the range become more likely and the numbers near the ends of the range become more uncommon. Here is the standard “3d6” statistical distribution that tabletop gamers all know and love:

Most of the results here fall into the range of 8-13, but very low results (3-4) and very high results (17-18) are still possible (but very uncommon).

We can further this effect by rolling even more dice. What happens when we roll, say, twenty dice?

This is the classic “bell curve” shape of what is known in mathematics as the “normal distribution”. In this case, most of the results fall between 60 and 80, and very low (<40) or very high (>100) results, while theoretically possible, happen so infrequently that they practically never happen.
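For anyone who would rather count than trust the pictures, here is a minimal Python sketch that tallies every way to reach each total by adding one die at a time. The function name `dice_sum_counts` is made up for this example; it is not from any library.

```python
from collections import Counter

def dice_sum_counts(num_dice, sides=6):
    """Count how many ways each total can be rolled with num_dice fair dice."""
    counts = Counter({0: 1})  # one way to roll no dice and total 0
    for _ in range(num_dice):
        nxt = Counter()
        for total, ways in counts.items():
            for face in range(1, sides + 1):
                nxt[total + face] += ways
        counts = nxt
    return dict(sorted(counts.items()))

two = dice_sum_counts(2)
print(two[2], two[7])  # ways to roll a 2 and a 7 on 2d6

three = dice_sum_counts(3)
in_range = sum(three[t] for t in range(8, 14))
print(in_range, "of", 6**3, "3d6 rolls land in 8-13")

twenty = dice_sum_counts(20)
middle_share = sum(w for t, w in twenty.items() if 60 <= t <= 80) / 6**20
print(f"{middle_share:.1%} of 20d6 rolls land in 60-80")
```

Swapping in `sides=11` and `num_dice=19` gives the random part of the HP roll.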

Applying This To RotMG: HP

Every time a RotMG character levels up, its HP increases by a random value between 20 and 30 (inclusive). We can interpret this as “the character’s HP always increases by 19, plus the value returned by rolling one 11-sided die”. Since a character levels up 19 times between lvl1 and lvl20, the character’s HP always increases by at least 361 (19*19), plus the value returned by rolling 19 11-sided dice:

This illustrates the distribution of HP rolls of unpotted lvl20 characters in RotMG (but shifted to the left, as this shows only the random component).
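If you want to see this without rolling thousands of characters by hand, a quick simulation does the job. This is only a sketch: it assumes each level-up gain is an independent, uniformly random whole number from 20 to 30, as described above, and the function name and seed are arbitrary choices for the example.

```python
import random

def simulate_hp_gain(rng, level_ups=19, low=20, high=30):
    """Total HP gained from level 1 to level 20 for one simulated character."""
    return sum(rng.randint(low, high) for _ in range(level_ups))

rng = random.Random(2014)  # fixed seed so the run is repeatable
gains = [simulate_hp_gain(rng) for _ in range(20_000)]
average_gain = sum(gains) / len(gains)

# Every total must fall between 19*20 = 380 and 19*30 = 570.
print(min(gains), max(gains), round(average_gain, 1))
```

The lowest and highest totals in the sample will usually sit well inside the theoretical 380 to 570 range, which is the "practically never happens" effect from the dice examples.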

Statistical Modeling of Rolling Dice

When dealing with statistical distributions, there are three important terms that are relevant here. The mean of a statistical distribution is the “average” of all the possible results, accounting for frequency of each possible result. The standard deviation and the variance are both measures of how often random results will be close to the mean: distributions with small variances will have “tall” and “skinny” bell curves, while distributions with large variances will have “wide” and “short” bell curves. For example, compare the blue, red, and gold curves in the following diagram:

The variance and the standard deviation are related: the standard deviation is simply the square root of the variance. Working with standard deviations is often preferable because they are in the same units as the distribution itself. In fact, for a normal distribution, 68.3% of all results will fall within one standard deviation of the mean, 95.4% of all results will fall within two standard deviations, and so on:
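As a rough check of those percentages, the share of a normal distribution within k standard deviations of the mean is erf(k / sqrt(2)). The helper name `share_within` below is just for illustration.

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, f"{share_within(k):.2%}")
```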

Let’s apply this to rolling dice.

When rolling a single die, the mean is the sum of all possible values divided by the number of sides. For a six-sided die, the mean will be (1+2+3+4+5+6)/6 = 3.5

When rolling a single die, the variance is “the mean of the squares, minus the square of the mean”. For a six-sided die, the mean of the squares is (1+4+9+16+25+36)/6 = 15.166… and the square of the mean is (3.5*3.5) = 12.25, so the variance is (15.166…-12.25) = 2.9166…

When rolling n identical dice, the mean and the variance are both simply multiplied by n. So when rolling 20 six-sided dice, the mean will be (20*3.5) = 70, and the variance will be (20*2.9166…) = 58.33… This gives the following normal distribution:
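The same arithmetic can be done exactly with fractions, which avoids the trailing "…" in the decimals. This is a small sketch; `die_mean_and_variance` is a name invented for the example.

```python
from fractions import Fraction

def die_mean_and_variance(sides):
    """Mean and variance of one fair die with faces 1..sides."""
    faces = range(1, sides + 1)
    mean = Fraction(sum(faces), sides)
    mean_of_squares = Fraction(sum(f * f for f in faces), sides)
    # Variance = mean of the squares minus the square of the mean.
    return mean, mean_of_squares - mean ** 2

mean, variance = die_mean_and_variance(6)
print(mean, variance)            # 7/2 35/12
print(20 * mean, 20 * variance)  # 70 175/3
```

Here 35/12 is the 2.9166… from above and 175/3 is the 58.33… for twenty dice.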

This looks identical to the “20d6” example of the previous post (the only exception is that values below 45 and above 95 are so rare that WolframAlpha doesn’t even bother to show them).

Applying this to RotMG: HP

When rolling an “11-sided die”, the mean is (1+2+3+4+5+6+7+8+9+10+11)/11 = 6 and the resulting variance is (1+4+9+16+25+36+49+64+81+100+121)/11-(6*6) = 10. When rolling 19 “11-sided dice”, the resulting mean is (19*6) = 114 and the resulting variance is (19*10) = 190.

However, this is only the random component of the HP roll–it doesn’t account for the constant component. The robe classes begin with 100 HP and gain a constant 19 HP every time they level up. This means all level 20 robe characters have at least 100+(19*19) = 461 HP, plus the random component shown above. As a result, the robe classes have an HP distribution with a mean of (461+114) = 575 HP and a variance of 190:
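Putting the constant and random parts together, here is a short sketch that works out the level 20 mean and variance for each starting HP value mentioned in this post (100, 105, 125, 200). The function name is hypothetical, and the 20 to 30 gain per level-up is taken from the text above.

```python
from fractions import Fraction

def level20_hp_stats(start_hp, level_ups=19, low=20, high=30):
    """Return (mean, variance) of unpotted level 20 HP for a given starting HP."""
    gains = range(low, high + 1)
    mean_gain = Fraction(sum(gains), len(gains))
    var_gain = Fraction(sum(g * g for g in gains), len(gains)) - mean_gain ** 2
    # Means and variances of independent level-ups simply add up.
    return start_hp + level_ups * mean_gain, level_ups * var_gain

for start_hp in (100, 105, 125, 200):
    mean, variance = level20_hp_stats(start_hp)
    print(start_hp, mean, variance)
```

Only the mean moves with the starting HP; the variance stays at 190 for every class, which is why the curves are identical apart from the shift.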

The other classes have identical HP distributions, but the curves are shifted to the right to reflect their additional HP at level 1 (105 HP, 125 HP, or 200 HP).

To interpret this graph, the total area under the curve represents the entire RotMG population of unpotted level 20 characters. For a given HP value:

  1. the height of the curve at that point represents how common that exact roll is,
  2. the area under the curve to the left of that point represents the entire RotMG population who have worse rolls, and
  3. the area under the curve to the right of that point represents the entire RotMG population who have better rolls.

Items 2 and 3 can be casually compared to see just how good a particular HP roll might be.

For example, let’s look at 595 HP. For a robe wearer, this is a +20 HP roll:

The probability of getting exactly 595 HP is about 0.01, or only about 1%. The area to the right of this line is pretty small, so the odds of getting a better roll are also pretty small; this guy’s a keeper. The area to the left looks to be about ten times bigger than the area on the right, so this roll is probably in the top 10% (one-tenth) of all possible rolls. (This is just an estimate; we’ll directly calculate this in the next section.)

For a dagger class, on the other hand, this same 595 HP is a -5 HP roll:

As you can see, the probability of getting this exact roll is considerably higher with the dagger classes. It’s below average, but the areas on either side of the line are pretty close to being equal, so if you want to discard this character and reroll, you have a pretty good chance that your next roll will be even worse. If you would have been happy with a +0 HP roll, it might not be worth the time to reroll here.
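The curve heights and areas in these two examples can be read off numerically with Python's `statistics.NormalDist`. This is a sketch under the normal approximation; the dagger mean of 600 HP is inferred from the statement that 595 HP is a -5 HP roll.

```python
import math
from statistics import NormalDist

SIGMA = math.sqrt(190)  # standard deviation of the HP roll
robe = NormalDist(mu=575, sigma=SIGMA)
dagger = NormalDist(mu=600, sigma=SIGMA)  # 595 HP is a -5 roll, so the mean is 600

for name, dist in (("robe", robe), ("dagger", dagger)):
    height = dist.pdf(595)  # how common this exact roll is
    worse = dist.cdf(595)   # area to the left of the line
    print(f"{name}: height {height:.4f}, worse {worse:.1%}, better {1 - worse:.1%}")
```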

Applying this to RotMG: MP

Skipping the math here (the math is left as an exercise for the reader), the MP distribution of the “spellcasting” classes has a mean of 290 MP and a variance of 190:

and the MP distributions of the other classes have a mean of 195 MP and a variance of 76:
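For readers who skipped the exercise, here is one way the numbers can come out. Big caveat: the inputs below are assumptions, not figures stated in this post. I am assuming 100 starting MP for both groups, 5 to 15 MP per level-up for spellcasters, and 2 to 8 MP per level-up for everyone else, chosen because they reproduce the means and variances quoted above.

```python
from fractions import Fraction

def level20_stat(start, low, high, level_ups=19):
    """Return (mean, variance) at level 20 for a uniform per-level gain in low..high."""
    gains = range(low, high + 1)
    mean_gain = Fraction(sum(gains), len(gains))
    var_gain = Fraction(sum(g * g for g in gains), len(gains)) - mean_gain ** 2
    return start + level_ups * mean_gain, level_ups * var_gain

# ASSUMED inputs (not from the post): start 100 MP, gains 5-15 and 2-8 per level-up.
caster_mean, caster_var = level20_stat(100, 5, 15)
other_mean, other_var = level20_stat(100, 2, 8)
print(caster_mean, caster_var)
print(other_mean, other_var)
```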

Interpretation: Standard Deviations and HP

Given the HP distribution shown here, the standard deviation is sqrt(190) = 13.78 HP.

This means that a +14 HP roll (one standard deviation away) is better than 84.1% of all HP rolls. A +28 HP roll (two standard deviations away) is better than 97.7% of all HP rolls. A +41 HP roll (three standard deviations away) is better than about 99.9% of all HP rolls, roughly a “one in 700” possibility.

For other values, we need to use the “cumulative distribution function” of the normal distribution. When you get a good roll, its percentile rank (the percentage of all rolls that are worse than it) can be calculated from the formula:

P(X ≤ n) = (1/2) erfc((-n)/(2 sqrt(95)))

where “erfc” is the “complementary error function”. Plugging in some interesting values here:

| HP roll | Better than | Rough odds of occurrence |
|---|---|---|
| +0 | 50.0% | one in two |
| +5 | 64.2% | one in three |
| +10 | 76.6% | one in four |
| +15 | 86.2% | one in seven |
| +20 | 92.7% | one in thirteen |
| +25 | 96.5% | one in thirty |
| +30 | 98.5% | one in seventy |
| +35 | 99.4% | one in 200 |
| +40 | 99.8% | one in 500 |
| +45 | 99.95% | one in 2000 |
| +50 | 99.98% | one in 5000 |
| +55 | 99.9967% | one in 30000 |
| +60 | 99.9993% | one in 150000 |
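The erfc formula translates directly into Python, since `math.erfc` is in the standard library. In this sketch the function name and the `variance` parameter are my own; note that sqrt(2 * 190) is the same thing as 2 sqrt(95).

```python
import math

def percentile_rank(n, variance=190):
    """Normal-approximation share of rolls that are worse than a +n roll."""
    return 0.5 * math.erfc(-n / math.sqrt(2 * variance))

for n in range(0, 65, 5):
    p = percentile_rank(n)
    print(f"+{n:<3} {p:9.4%}  about one in {1 / (1 - p):,.0f}")
```

Passing `variance=76` instead gives the percentile ranks for non-spellcaster MP rolls.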

Truly, if you want a “one in a million” HP roll, you’ll need to aim for +66 HP. Assuming 15 minutes per roll and 24/7 gameplay, a million rolls works out to about 28.5 years of rolling (on average).
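The "one in a million" threshold can be found by running the normal distribution backwards with `NormalDist.inv_cdf`. This is a sketch under the normal approximation; the 15 minutes per roll is the assumption from the paragraph above, and the variable names are arbitrary.

```python
import math
from statistics import NormalDist

hp_roll = NormalDist(mu=0, sigma=math.sqrt(190))

# Smallest whole-number roll that beats all but one in a million rolls.
threshold = math.ceil(hp_roll.inv_cdf(1 - 1e-6))

# A one-in-a-million event takes a million tries on average.
expected_rolls = 1_000_000
minutes_per_roll = 15
years = expected_rolls * minutes_per_roll / (60 * 24 * 365.25)

print(f"+{threshold} HP, about {years:.1f} years of nonstop rolling")
```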

Interpretation: Standard Deviations and MP

Granted, that +10hp roll might still be garbage if it’s -25mp, so let’s account for the MP distributions as well.

For spellcasters, the variance of 190 matches the variance of the HP distributions, so the table above can be used for their MP rolls as well.

For other classes, the variance of 76 yields the following formula:

P(X ≤ n) = (1/2) erfc((-n)/(2 sqrt(38)))

which gives the following table:

| MP roll | Better than | Rough odds of occurrence |
|---|---|---|
| +0 | 50.0% | one in two |
| +5 | 71.7% | one in three |
| +10 | 87.4% | one in eight |
| +15 | 95.7% | one in twenty |
| +20 | 98.9% | one in 100 |
| +30 | 99.97% | one in 3000 |
| +40 | 99.99977% | one in 450000 |

A “one in a million” MP roll for a non-spellcaster would be +42 MP.

Application

Everybody approaches rolling differently. This data can be used to help you calculate how good that +20/+10 roll really is (it’s about a “one in 100” roll), or decide how long it would take to roll better than that +10/-25 klunker (not too long).
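Here is roughly how those two combined figures can be worked out. The sketch makes two assumptions that are mine, not stated outright in the post: HP and MP rolls are independent, and the character is a non-spellcaster (MP variance of 76). The function name is also just for this example.

```python
import math

def chance_at_least(n, variance):
    """Normal-approximation chance of rolling +n or better."""
    return 0.5 * math.erfc(n / math.sqrt(2 * variance))

# Chance of matching or beating a +20 HP / +10 MP roll.
keeper = chance_at_least(20, 190) * chance_at_least(10, 76)

# Chance of matching or beating a +10 HP / -25 MP roll.
klunker = chance_at_least(10, 190) * chance_at_least(-25, 76)

print(f"+20/+10: about one in {1 / keeper:.0f}")
print(f"+10/-25: about one in {1 / klunker:.1f}")
```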

Taking This to Eleven: Direct Calculation

One problem with the above math is that the normal distribution is only an approximation of what’s really going on (granted, it’s a very, very good one, but it’s an approximation nonetheless). The biggest discrepancy is that normal distributions are continuous while rolling/summing dice is discrete–after all, you can’t roll two dice that sum to 6.2.

I can just hear you all now: “But OB, you can’t get a +0.8hp roll! Why approximate at all? I want you to directly calculate all of this! I want to know exactly how many ways there are to generate a +3hp roll!!”

To those people, all I can say is: Let it never be said that OB doesn’t deliver.

When you roll a character, every level-up gives 11 different possibilities for +HP. Since you level up 19 times, there are 11^19 (that’s 61,159,090,448,414,546,291) different possibilities…so let’s account for each and every one of them.

Binomial Expansion and Pascal’s Triangle

Let’s flip an equally-weighted coin–two possible outcomes. There’s one way to get heads and one way to get tails. Let’s think of this as {1,1}.

Let’s flip it twice–four possibilities. There’s one way to get two heads, two different ways to get a head and a tail (heads then tails, and tails then heads), and one way to get two tails. Let’s think of this as {1,2,1}.

Let’s flip it three times–eight possibilities. There’s one way to get three heads. There are three different ways to get two heads and a tail. There are three different ways to get one head and two tails. Finally, there’s one way to get three tails. Again, {1,3,3,1}.

Like magic, these rows of numbers just happen to match the rows in Pascal’s Triangle:

Once you make this connection, it’s easy to make statements like “if you flip the coin seven times, there are 21 different ways to get two heads and five tails”.

Coincidentally, this array is also used to get the coefficients for expansion of the binomial (x+y)^n for a constant n. For example, (x+y)^3 = 1 x^3 + 3 x^2y + 3 xy^2 + 1 y^3. (Note that the coefficients, when summed, add up to 2^n. This will be important later.)
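The rows of Pascal's Triangle are easy to generate with `math.comb` (Python 3.8 or later). The helper name `pascal_row` is made up for this sketch.

```python
import math

def pascal_row(n):
    """Row n of Pascal's Triangle: ways to get k tails in n coin flips."""
    return [math.comb(n, k) for k in range(n + 1)]

for n in range(1, 4):
    print(n, pascal_row(n))

print(math.comb(7, 2))     # two heads and five tails in seven flips
print(sum(pascal_row(7)))  # all coefficients together, which is 2**7
```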

n-omial Expansion and Really Frickin’ Big Numbers

However, what if we’re not flipping coins? What if we’re rolling 11-sided dice? Since we’re dealing with sums of numbers, it’s easy to let WolframAlpha handle it for us.

Let’s look at HP for the robe classes, again. We begin with 100 HP, every time we level up we gain between 20 HP and 30 HP, and we level up 19 times. So let’s plug the following in to WolframAlpha:

x^100 * (x^20+x^21+x^22+x^23+x^24+x^25+x^26+x^27+x^28+x^29+x^30)^19

WolframAlpha merrily returns a huge expansion:

x^670+19 x^669+190 x^668+1330 x^667+7315 x^666+33649 x^665+134596 x^664+480700 x^663+1562275 x^662+4686825 x^661+13123110 x^660+34597271 x^659+86492864 x^658+206249465 x^657+471410330 x^656+1037019335 x^655+ ... +

... +

... [Seriously, a lot of stuff is cut out here. OB] ... +

... +

... +1037019335 x^495+471410330 x^494+206249465 x^493+86492864 x^492+34597271 x^491+13123110 x^490+4686825 x^489+1562275 x^488+480700 x^487+134596 x^486+33649 x^485+7315 x^484+1330 x^483+190 x^482+19 x^481+x^480

We don’t actually care about the formula here. Instead, this was just an easy way to get WolframAlpha to calculate all the coefficients for us.

Looking back at the “flipping coins” example, we can interpret the expansion as follows: out of all of those 11^19 possibilities, the “1 x^670” means there’s only one way to end up with 670 HP. “19 x^669” means there are 19 ways to end up with 669 HP. The term right before the cut means there are 1,037,019,335 ways to end up with 655 HP. And to answer the earlier question you were asking about, there are exactly 1,716,012,478,586,176,034 ways to get that +3hp roll (not shown in the box above).

Once you capture all of those coefficients and verify that they collectively all add up to 11^19 (here’s a hint: they do), then you can plug all of that into a spreadsheet and calculate percentiles.
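If you don't have WolframAlpha handy, the same coefficients can be built up one level-up at a time, which is what multiplying out the polynomial amounts to. This is a sketch, not the method used for the numbers in this post; `hp_way_counts` is a hypothetical name, and Python's big integers mean nothing overflows.

```python
def hp_way_counts(start_hp=100, level_ups=19, low=20, high=30):
    """Map each possible level 20 HP total to the number of ways to roll it."""
    counts = {start_hp: 1}  # a level 1 character: one way to have the starting HP
    for _ in range(level_ups):
        nxt = {}
        for hp, ways in counts.items():
            for gain in range(low, high + 1):
                nxt[hp + gain] = nxt.get(hp + gain, 0) + ways
        counts = nxt
    return counts

counts = hp_way_counts()
total_ways = sum(counts.values())
print(total_ways == 11 ** 19)                  # the sanity check described above
print(counts[670], counts[669], counts[655])   # matches the coefficients quoted above
print(counts[578])                             # ways to roll +3 HP (578 = 575 + 3)
```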

Interpretation

+0 HP rolls happen about 2.87% of the time. Even though +0 HP is statistically considered to be “50th percentile”, it turns out that +0 HP is actually better than only 48.56% of all rolls. This discrepancy occurs because there are so many different ways to get exactly +0 HP: those tied rolls are neither better nor worse, so the remaining rolls split evenly into about 48.56% worse and about 48.56% better.

So, let’s rebuild the above tables, using data crunched by WolframAlpha and Excel:

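| HP/spellcaster MP roll | Better than |
|---|---|
| +0 | 48.56% |
| +5 | 62.70% |
| +10 | 75.32% |
| +15 | 85.24% |
| +20 | 92.09% |
| +25 | 96.23% |
| +30 | 98.42% |
| +35 | 99.42% |
| +40 | 99.82% |
| +45 | 99.95% |
| +50 | 99.989% |
| +55 | 99.998% |
| +60 | 99.9997% |

| MP (non-spellcaster) roll | Better than |
|---|---|
| +0 | 47.73% |
| +5 | 69.59% |
| +10 | 86.10% |
| +15 | 95.18% |
| +20 | 98.78% |
| +30 | 99.97% |
| +40 | 99.9999% |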
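For anyone without a spreadsheet, the exact "better than" figures can be approximated in spirit with a few lines of Python that count every combination directly. This is a sketch rather than the WolframAlpha/Excel workflow used for the tables; the function name is invented, and the spread of 6 for non-spellcaster MP is an assumption (seven equally likely gains per level-up) that reproduces the variance of 76.

```python
def exact_better_than(roll, level_ups=19, spread=10):
    """Share of all rolls that are strictly worse than a +roll, counted exactly.

    spread is (max gain - min gain) per level-up: 10 for HP and spellcaster MP,
    6 assumed for non-spellcaster MP. level_ups * spread must be even so that
    a +0 roll is a whole number.
    """
    counts = [1]  # counts[t] = ways for the random part to total t
    for _ in range(level_ups):
        nxt = [0] * (len(counts) + spread)
        for total, ways in enumerate(counts):
            for gain in range(spread + 1):
                nxt[total + gain] += ways
        counts = nxt
    target = level_ups * spread // 2 + roll
    target = max(0, min(len(counts), target))  # keep the slice in range
    return sum(counts[:target]) / (spread + 1) ** level_ups

for roll in (0, 5, 10, 20):
    print(f"+{roll}: HP {exact_better_than(roll):.2%}, "
          f"MP {exact_better_than(roll, spread=6):.2%}")
```

Ties are deliberately left out of the count, which is why +0 comes out a little under 50%.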

An Interesting Question: Knight Moves

When I was discussing this work on #rotmg, somebody posed the following question: “What are the odds of rolling a Knight that already has max att?” After all, the Knight starts with 15 att, gains 1-2 att per level-up, and maxes out at 50 att, so it’s possible to “naturally” get a 1/8 Knight.

So, let’s feed the following into WolframAlpha:

x^15 (x + x^2)^19

and WolframAlpha returns the expansion:

x^53+19 x^52+171 x^51+969 x^50+3876 x^49+11628 x^48+27132 x^47+50388 x^46+75582 x^45+92378 x^44+92378 x^43+75582 x^42+50388 x^41+27132 x^40+11628 x^39+3876 x^38+969 x^37+171 x^36+19 x^35+x^34

Att won’t increase once you hit 50, so there are in fact (1+19+171+969) = 1160 different ways to roll a 50-att Knight, out of (2^19) = 524,288 possibilities. Thus, that max-att Knight should happen (1160/524288) = 0.0022, or 0.22% of the time. This is roughly a “one in 500” happening.
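The Knight count can also be checked without expanding the polynomial. Each level-up gives +1 or +2 att, so the final att is 15 + 19 + (number of +2 gains); reaching 50 takes at least 16 of them. This is a small sketch using `math.comb`, with variable names of my own choosing.

```python
import math

LEVEL_UPS = 19
START_ATT = 15
MAX_ATT = 50

# Every level-up gives at least +1, so only the number of +2 gains matters.
needed_twos = MAX_ATT - START_ATT - LEVEL_UPS
ways = sum(math.comb(LEVEL_UPS, twos) for twos in range(needed_twos, LEVEL_UPS + 1))
chance = ways / 2 ** LEVEL_UPS

print(needed_twos, ways, f"{chance:.2%}", f"about one in {1 / chance:.0f}")
```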

So, while a “natural” 1/8 Knight is theoretically possible, nobody will ever believe you if it happens to you (some long-term players have only heard of this happening 2 or 3 times). If you get one, certainly enjoy it, but don’t get mad if people think you’re trolling.

References

RealmEye, “Realm of the Mad God Wiki”. (http://www.realmeye.com/wiki/realm-of-the-mad-god)

StackExchange, “Probability Distribution of Rolling Multiple Dice”. (http://math.stackexchange.com/questions/406192/probability-distribution-of-rolling-multiple-dice)

Wikipedia, “Normal Distribution”. (http://en.wikipedia.org/wiki/Normal_distribution)

Wikipedia, “Pascal’s Triangle”. (http://en.wikipedia.org/wiki/Pascal%27s_triangle)

Wikipedia, “Standard Deviation”. (http://en.wikipedia.org/wiki/Standard_deviation)

Wolfram Alpha, “Wolfram Alpha”. (http://www.wolframalpha.com)