diceroll-visualization

Using matplotlib and seaborn to graph the probabilities of the faces of a die in a specified number of rolls.

View the Project on GitHub adviksinghania/diceroll-visualization

About

Law of Large Numbers: In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and will tend to become closer to the expected value as more trials are performed.

We use a fair die as an example to demonstrate this law. All the faces of the die are equally likely to occur, but when the number of rolls is small, the frequencies of the faces are uneven and some faces come up noticeably more often than others. When we roll the die a large number of times, the relative frequency of each face approaches the expected value 1/6 (≈ 0.1667), or 16.667%.
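
As a rough illustration of this convergence, the sketch below rolls a simulated die for several sample sizes and reports how far the most off-target face is from 1/6. The helper name max_deviation and the seed value are assumptions made for this example; they are not part of the project's scripts.

```python
import random


def max_deviation(num_rolls, seed=None):
    """Roll a fair die num_rolls times and return the largest gap between
    any face's relative frequency and the expected value 1/6."""
    rng = random.Random(seed)  # seeded generator so runs can be repeated
    counts = [0] * 6
    for _ in range(num_rolls):
        counts[rng.randrange(1, 7) - 1] += 1
    return max(abs(count / num_rolls - 1 / 6) for count in counts)


# The gap typically shrinks as the number of rolls grows.
for n in (60, 6_000, 600_000):
    print(f'{n:>7} rolls: max deviation from 1/6 = {max_deviation(n, seed=1):.4f}')
```

The deviation is not guaranteed to fall at every step, but for large sample sizes it should be small.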

Initial script

We create a script that rolls a die (generates a random integer from 1 to 6) a specified number of times. For a fair die, the expected frequency of each face is the number of rolls divided by 6, e.g. 1000/6 for 1,000 rolls or 5000/6 for 5,000 rolls. Let’s take 6,000,000 rolls for simplicity: the expected frequency of each face is then exactly 1,000,000.

#!/usr/bin/env python3
# roll_die.py
"""Roll a six-sided die num_roll times and print the frequency of each face."""

import random

# number of die rolls (set to 6_000_000 for the large run)
num_roll = 100
# face frequency counters, one per face
frequency = [0] * 6

# roll the die num_roll times
for _ in range(num_roll):
    # random value in range 1 to 6 (face of the die)
    face = random.randrange(1, 7)
    # increment appropriate face counter
    frequency[face - 1] += 1

# output: displaying frequency for each face
print(f'Face{"Frequency":>13}')
for i, j in enumerate(frequency):
    print(f'{i + 1:>4}{j:>13}')

Running the above script would display an output like:

$ python3 roll_die.py
Face    Frequency
   1           15
   2           10
   3           16
   4           22
   5           13
   6           24

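Here, the observed relative frequency is highest for face ‘6’ (24%) and lowest for face ‘2’ (10%). These values do not reflect the true probabilities, because all the faces of the die are equally likely to be the outcome; with only 100 rolls, the sample is too small for the frequencies to even out.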
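
The percentages quoted here can be reproduced from the sample output with a few lines of Python. This is only a small sketch: the frequencies are copied from the 100-roll run shown above, and the variable names are chosen for illustration.

```python
# Frequencies copied from the 100-roll sample output above.
frequencies = [15, 10, 16, 22, 13, 24]
total = sum(frequencies)

# Relative frequency of each face, as a fraction of all rolls.
relative = [count / total for count in frequencies]

print(f'Face{"Frequency":>13}{"Relative":>12}')
for face, (count, share) in enumerate(zip(frequencies, relative), start=1):
    print(f'{face:>4}{count:>13}{share:>12.1%}')

# Faces with the highest and lowest observed counts.
most = max(range(6), key=lambda i: frequencies[i]) + 1
least = min(range(6), key=lambda i: frequencies[i]) + 1
print(f'most frequent: face {most}, least frequent: face {least}')
```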

If roll_die.py is run again with num_roll changed from 100 to 6,000,000, the output looks like:

$ python3 roll_die.py
Face    Frequency
   1      1000104
   2       999846
   3      1000645
   4       999434
   5      1001315
   6       998656

We can see that the frequency of each face is close to 1,000,000, so the relative frequency of each face is close to 16.67%. Depending on your system, this script may take a while to run, since it generates 6 million random numbers. Now, we’ll create a script to visualize the distribution using matplotlib and seaborn.
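
Before moving on to the plots, a note on running time. The sketch below is one possible variant of the counting step, not the project's own code: it draws all rolls with a single random.choices call, tallies them with collections.Counter, and times itself with time.perf_counter so the run can be compared against the loop version on your machine. The function name roll_counts, the seed, and the sample size of 600,000 are assumptions for this example.

```python
import random
import time
from collections import Counter


def roll_counts(num_rolls, seed=None):
    """Return a Counter mapping each face (1 to 6) to how often it came up."""
    rng = random.Random(seed)
    # random.choices draws all rolls in one call; Counter tallies them.
    return Counter(rng.choices(range(1, 7), k=num_rolls))


start = time.perf_counter()
counts = roll_counts(600_000, seed=42)
elapsed = time.perf_counter() - start

print(f'Face{"Frequency":>13}')
for face in range(1, 7):
    print(f'{face:>4}{counts[face]:>13}')
print(f'elapsed: {elapsed:.2f} s')
```

Note that this variant holds all rolls in memory at once, which the loop version does not.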

Setup

Static visualization

Now, run roll_die_plot.py in your Linux terminal, passing the number of die rolls as its argument, to create a static graph of the frequency distribution.

Example 1:

$ python roll_die_plot.py 100

Figure 1

Face |   Frequency | Probability
   1 |          14 |     14.000%
   2 |          19 |     19.000%
   3 |          14 |     14.000%
   4 |          25 |     25.000%
   5 |          16 |     16.000%
   6 |          12 |     12.000%


Example 2:

$ python roll_die_plot.py 6000000

Figure 2

Face |   Frequency | Probability
   1 |      999521 |     16.659%
   2 |      999198 |     16.653%
   3 |      998664 |     16.644%
   4 |     1002321 |     16.705%
   5 |      999494 |     16.658%
   6 |     1000802 |     16.680%
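
The source of roll_die_plot.py is not reproduced on this page, so the following is only a minimal sketch of how such a script could be put together, not the actual implementation. It assumes a seaborn bar plot of the six frequencies and a table in the layout shown above; the function names roll_frequencies, format_table and plot_frequencies are hypothetical.

```python
import random


def roll_frequencies(num_rolls, seed=None):
    """Roll a fair die num_rolls times and return the six face counts."""
    rng = random.Random(seed)
    frequencies = [0] * 6
    for _ in range(num_rolls):
        frequencies[rng.randrange(1, 7) - 1] += 1
    return frequencies


def format_table(frequencies):
    """Build the Face | Frequency | Probability table as one string."""
    total = sum(frequencies)
    lines = [f'{"Face":>4} | {"Frequency":>11} | {"Probability":>11}']
    for face, count in enumerate(frequencies, start=1):
        lines.append(f'{face:>4} | {count:>11} | {count / total * 100:>10.3f}%')
    return '\n'.join(lines)


def plot_frequencies(frequencies):
    """Show a bar chart of the face counts (needs matplotlib and seaborn)."""
    # Imported here so the table helpers work without the plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    total = sum(frequencies)
    sns.set_style('whitegrid')
    ax = sns.barplot(x=list(range(1, 7)), y=frequencies)
    ax.set_title(f'Die frequencies for {total:,} rolls')
    ax.set(xlabel='Die face', ylabel='Frequency')
    plt.show()


frequencies = roll_frequencies(100, seed=7)
print(format_table(frequencies))
# plot_frequencies(frequencies)  # uncomment to open the bar chart
```

The plotting imports sit inside plot_frequencies so that the counting and table code can be tried on a machine without matplotlib or seaborn installed.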


Dynamic Visualization

In this script, we use FuncAnimation from matplotlib’s animation module, which updates the bar plot dynamically.
The script takes two command line arguments: the number of frames and the number of die rolls per frame.

Example:

$ python roll_die_dynamic.py 300 20

This runs the script for 300 frames with 20 die rolls per frame and a 30-millisecond interval between frames (about 33 FPS). The update function is called once per frame, so the animation performs 6000 die rolls in total (20 rolls per frame × 300 frames).
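
The source of roll_die_dynamic.py is not reproduced on this page, so the sketch below only illustrates the general FuncAnimation pattern under stated assumptions. It uses plain matplotlib bars rather than seaborn, and the names roll_batch and run_animation are hypothetical. The last lines redo the arithmetic for the example command.

```python
import random


def roll_batch(frequencies, rolls, rng):
    """Add `rolls` new die rolls to the running face counts, in place."""
    for _ in range(rolls):
        frequencies[rng.randrange(1, 7) - 1] += 1


def run_animation(frames, rolls_per_frame, interval_ms=30):
    """Animate the face counts (needs matplotlib)."""
    # Imported here so roll_batch can be used without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation

    rng = random.Random()
    frequencies = [0] * 6
    fig, ax = plt.subplots()
    bars = ax.bar(range(1, 7), frequencies)
    ax.set_xlabel('Die face')
    ax.set_ylabel('Frequency')

    def update(frame_number):
        """Called once per frame: roll the die, then resize the bars."""
        roll_batch(frequencies, rolls_per_frame, rng)
        for bar, count in zip(bars, frequencies):
            bar.set_height(count)
        ax.set_ylim(0, max(frequencies) * 1.1)
        ax.set_title(f'{sum(frequencies):,} rolls')
        return bars

    # A no-op init_func keeps the initial draw from invoking update().
    animation = FuncAnimation(fig, update, frames=frames,
                              init_func=lambda: bars,
                              interval=interval_ms, repeat=False)
    plt.show()  # `animation` stays referenced while the window is open
    return animation


frames, rolls_per_frame, interval_ms = 300, 20, 30
print(f'total rolls: {frames * rolls_per_frame}')
print(f'frame rate: about {1000 / interval_ms:.1f} FPS')
# run_animation(frames, rolls_per_frame, interval_ms)  # uncomment to watch
```

Keeping the rolling logic in roll_batch, separate from the drawing code, makes it possible to check the counts without opening a window.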
