Geometry

I have read many technical blogs that clarified my understanding of some topics in the past. Some notable shoutouts are this visual information theory breakdown, and this post on Maxwell's demon. They are both very thorough and I recommend you read them.

I think technical blogs are an essential source of knowledge, because in order for someone to write them, they need to understand deeply (and empirically) whatever they're trying to explain to you. Plus, they are also written in a way that you're supposed to understand, unlike papers. I encourage you to write one of your own!

This brief write-up is my first attempt to contribute back by illustrating some interesting geometrical properties of binomial events I found through a statistics class I'm taking this semester.

Picture a fair coin. There are two outcomes, heads or tails, each with a 50% chance.

If you wanted to know the chances that this coin lands on heads N times consecutively, and each flip is independent, you can just multiply:

P(\text{heads } N \text{ times}) = \left(\frac{1}{2}\right)^{N}
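As a quick check of the multiplication rule, here is a minimal sketch that computes this probability exactly. The function name `prob_all_heads` is my own choice for illustration, and it assumes a fair coin with independent flips.

```python
from fractions import Fraction


def prob_all_heads(n):
    """Probability that a fair coin lands heads n times in a row."""
    # Independent flips multiply: (1/2) * (1/2) * ... taken n times.
    return Fraction(1, 2) ** n


for n in (1, 2, 6):
    print(n, prob_all_heads(n))  # the last line printed is: 6 1/64
```

Using `Fraction` keeps the result exact, so six heads in a row comes out as 1/64 rather than a rounded decimal.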

Now, you probably already know this formula if you took a probability class in college or high school.

What you might not have learned there is that geometry is a powerful way of describing probability, specifically when partitioning a set A that contains all the outcomes of interest.

3Blue1Brown's video on Bayes' theorem does a great job of visualizing how one can consider all possible outcomes (in their example, the profession of a group of people) as a rectangle that can be broken down into overlapping sections from which Bayes' theorem can be derived. You can watch it here.

Applying this to the coin toss question, we can visualize all the possible heads and tails sequences ('HHHTTT', 'HTHTHT', 'HHHHHH', etc.) one may get after tossing a coin six times in a similar way: by drawing the sample space. A sample space is the set of all possible outcomes of an experiment; in this example, every coin toss sequence. Because the outcomes are discrete, each one can be drawn as its own block.
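To make the sample space concrete before drawing it, here is a small sketch that simply lists it. The variable names are illustrative, and it uses plain 'H' and 'T' characters rather than the localized labels in the plotting script.

```python
from itertools import product

NUM_FLIPS = 6

# Every length-6 string over {H, T}; each string is one outcome.
sample_space = [''.join(flips) for flips in product('HT', repeat=NUM_FLIPS)]

print(len(sample_space))  # 64, i.e. 2**6
print(sample_space[:3])   # ['HHHHHH', 'HHHHHT', 'HHHHTH']
```

Each of the 64 strings corresponds to one block in the rightmost column of the figure.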

Sample Space for 6 Consecutive [Independent] Coin Tosses

Show visualization code

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path

NUM_FLIPS = 6
ASSETS_DIR = Path(__file__).resolve().parent.parent / "figures"

LOCALE_FONTS = {
    'ja': 'Hiragino Sans',
    'ko': 'Apple SD Gothic Neo',
    'zh': 'Songti SC',
    'hi': 'Kohinoor Devanagari',
}

LOCALE_LABELS = {
    'en': {
        'xlabel': 'Coin Flip Number',
        'ylabel': 'Probability space (0–1)',
        'title': 'Sample Space Division: 6 Coin Flips\n(Total outcomes = 2^6 = 64)',
        'H': 'H', 'T': 'T',
    },
    'es': {
        'xlabel': 'Número de lanzamiento',
        'ylabel': 'Espacio de probabilidad (0–1)',
        'title': 'División del espacio muestral: 6 lanzamientos\n(Resultados totales = 2^6 = 64)',
        'H': 'C', 'T': 'S',
    },
    'de': {
        'xlabel': 'Münzwurf-Nummer',
        'ylabel': 'Wahrscheinlichkeitsraum (0–1)',
        'title': 'Stichprobenraum-Aufteilung: 6 Münzwürfe\n(Gesamtergebnisse = 2^6 = 64)',
        'H': 'K', 'T': 'Z',
    },
    'fr': {
        'xlabel': 'Numéro de lancer',
        'ylabel': 'Espace de probabilité (0–1)',
        'title': "Division de l'espace des résultats : 6 lancers\n(Résultats totaux = 2^6 = 64)",
        'H': 'F', 'T': 'P',
    },
    'pt': {
        'xlabel': 'Número do lançamento',
        'ylabel': 'Espaço de probabilidade (0–1)',
        'title': 'Divisão do espaço amostral: 6 lançamentos\n(Resultados totais = 2^6 = 64)',
        'H': 'C', 'T': 'R',
    },
    'pl': {
        'xlabel': 'Numer rzutu',
        'ylabel': 'Przestrzeń prawdopodobieństwa (0–1)',
        'title': 'Podział przestrzeni próbkowania: 6 rzutów\n(Łączne wyniki = 2^6 = 64)',
        'H': 'O', 'T': 'R',
    },
    'ja': {
        'xlabel': 'コイン投げ回数',
        'ylabel': '確率空間 (0–1)',
        'title': '標本空間の分割：6回のコイン投げ\n（総結果数 = 2^6 = 64）',
        'H': '表', 'T': '裏',
    },
    'ko': {
        'xlabel': '동전 던지기 횟수',
        'ylabel': '확률 공간 (0–1)',
        'title': '표본 공간 분할: 동전 6번 던지기\n（총 경우의 수 = 2^6 = 64）',
        'H': '앞', 'T': '뒤',
    },
    'hi': {
        'xlabel': 'सिक्का उछाल संख्या',
        'ylabel': 'प्रायिकता स्थान (0–1)',
        'title': 'नमूना स्थान विभाजन: 6 सिक्का उछाल\n(कुल परिणाम = 2^6 = 64)',
        'H': 'च', 'T': 'प',
    },
    'zh': {
        'xlabel': '硬币投掷次数',
        'ylabel': '概率空间 (0–1)',
        'title': '样本空间划分：6次硬币投掷\n（总结果数 = 2^6 = 64）',
        'H': '正', 'T': '反',
    },
}


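def outcome_to_sequence(outcome, length, h_char, t_char):
    """Map an outcome index to its flip sequence (binary digits: 1 = heads, 0 = tails)."""
    if length == 0:
        # Before any flip there is a single, empty sequence.
        return ''
    # Zero-pad to `length` bits so every flip gets a character.
    bits = format(outcome, f'0{length}b')
    return ''.join(h_char if bit == '1' else t_char for bit in bits)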


def render(locale, theme, labels):
    is_light = theme == 'light'
    bg = 'white' if is_light else 'black'
    fg = 'black' if is_light else 'white'

    fig, ax = plt.subplots(figsize=(14, 8), facecolor=bg)
    ax.set_facecolor(bg)

    colors = plt.cm.Blues(np.linspace(0.3, 0.9, NUM_FLIPS + 1))
    h_char = labels['H']
    t_char = labels['T']

    for flip in range(NUM_FLIPS + 1):
        total_outcomes = 2**flip
        height = 1 / total_outcomes

        for outcome in range(total_outcomes):
            y_position = outcome * height
            rect = plt.Rectangle((flip, y_position), 0.8, height,
                                  linewidth=1, edgecolor=bg,
                                  facecolor=colors[flip], alpha=0.7)
            ax.add_patch(rect)

            if flip <= 5:
                label = outcome_to_sequence(outcome, flip, h_char, t_char)
                ax.text(flip + 0.4, y_position + height / 2, label,
                        ha='center', va='center', fontsize=8)

    ax.set_xlim(-0.5, NUM_FLIPS + 0.5)
    ax.set_ylim(0, 1)
    ax.set_xlabel(labels['xlabel'], fontsize=12, color=fg)
    ax.set_ylabel(labels['ylabel'], fontsize=12, color=fg)
    ax.set_title(labels['title'], fontsize=14, color=fg)
    ax.set_xticks(range(NUM_FLIPS + 1))
    ax.grid(axis='x', alpha=0.3, color=fg)
    ax.tick_params(colors=fg)
    for spine in ax.spines.values():
        spine.set_color(fg)

    out_dir = ASSETS_DIR / locale
    out_dir.mkdir(parents=True, exist_ok=True)
    suffix = '-light' if is_light else ''
    plt.tight_layout()
    plt.savefig(out_dir / f'coin-sample-space-n-6{suffix}.png', dpi=150, facecolor=bg)
    plt.close(fig)


def generate(locale, theme):
    font = LOCALE_FONTS.get(locale, 'DejaVu Sans')
    with matplotlib.rc_context({'font.family': font}):
        render(locale, theme, LOCALE_LABELS[locale])


def main():
    for locale in LOCALE_LABELS:
        for theme in ['dark', 'light']:
            print(f'  {locale}/{theme}')
            generate(locale, theme)


main()

The first block on the left shows the 0th coin flip, that is, the state before any toss. Since this starting point is certain, it occupies the whole Y-axis. The second column shows the outcome of the first coin toss. Since it can be either heads or tails, it is split into two equally sized blocks.

Further to the right, each column has twice as many blocks as the one before it, which makes sense intuitively because every sequence branches into two child outcomes with the next toss.
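The doubling can be checked with a few lines of arithmetic. This is a sketch that mirrors the `2**flip` and `1 / total_outcomes` expressions in the plotting script; the helper name `column_layout` is hypothetical.

```python
def column_layout(flip):
    """Return (number_of_blocks, block_height) for the column after `flip` tosses."""
    blocks = 2**flip     # every block splits in two with each toss
    height = 1 / blocks  # blocks in a column share the Y-axis equally
    return blocks, height


for flip in range(7):
    blocks, height = column_layout(flip)
    # The last value is the total height of the column, which stays at 1.
    print(flip, blocks, height, blocks * height)
```

However many blocks a column has, their heights always add up to the full Y-axis, which is what lets us read heights as probabilities.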

The Y-axis gives the probability of each single path: a block's height is its share of the whole. Every path of a given length is equally likely, so all blocks in a column have the same height, but the mixed sequences of heads and tails that fill the middle of each column vastly outnumber the rest, and together they occupy most of the Y-axis. The two paths of consecutive heads or tails (running along the top and bottom edges in a staircase pattern) occupy an ever-decreasing proportion of the Y-axis, reflecting their very low probabilities, as shown below.

Consecutive tails or heads shaded staircase.

Show visualization code

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path

NUM_FLIPS = 6
ASSETS_DIR = Path(__file__).resolve().parent.parent / "figures"

LOCALE_FONTS = {
    'ja': 'Hiragino Sans',
    'ko': 'Apple SD Gothic Neo',
    'zh': 'Songti SC',
    'hi': 'Kohinoor Devanagari',
}

LOCALE_LABELS = {
    'en': {
        'xlabel': 'Coin Flip Number',
        'ylabel': 'Probability space (0–1)',
        'title': 'Sample Space Division: 6 Coin Flips\nShowing Shaded Staircases of Consecutive H/T Patterns.',
        'H': 'H', 'T': 'T',
    },
    'es': {
        'xlabel': 'Número de lanzamiento',
        'ylabel': 'Espacio de probabilidad (0–1)',
        'title': 'División del espacio muestral: 6 lanzamientos\nEscaleras sombreadas de patrones consecutivos C/S.',
        'H': 'C', 'T': 'S',
    },
    'de': {
        'xlabel': 'Münzwurf-Nummer',
        'ylabel': 'Wahrscheinlichkeitsraum (0–1)',
        'title': 'Stichprobenraum-Aufteilung: 6 Münzwürfe\nSchattierte Treppen aufeinanderfolgender K/Z-Muster.',
        'H': 'K', 'T': 'Z',
    },
    'fr': {
        'xlabel': 'Numéro de lancer',
        'ylabel': 'Espace de probabilité (0–1)',
        'title': "Division de l'espace des résultats : 6 lancers\nEscaliers ombrés des motifs consécutifs F/P.",
        'H': 'F', 'T': 'P',
    },
    'pt': {
        'xlabel': 'Número do lançamento',
        'ylabel': 'Espaço de probabilidade (0–1)',
        'title': 'Divisão do espaço amostral: 6 lançamentos\nDegraus sombreados de padrões consecutivos C/R.',
        'H': 'C', 'T': 'R',
    },
    'pl': {
        'xlabel': 'Numer rzutu',
        'ylabel': 'Przestrzeń prawdopodobieństwa (0–1)',
        'title': 'Podział przestrzeni próbkowania: 6 rzutów\nZacieniowane schody kolejnych wzorców O/R.',
        'H': 'O', 'T': 'R',
    },
    'ja': {
        'xlabel': 'コイン投げ回数',
        'ylabel': '確率空間 (0–1)',
        'title': '標本空間の分割：6回のコイン投げ\n連続する表/裏パターンの階段（網掛け）',
        'H': '表', 'T': '裏',
    },
    'ko': {
        'xlabel': '동전 던지기 횟수',
        'ylabel': '확률 공간 (0–1)',
        'title': '표본 공간 분할: 동전 6번 던지기\n연속 앞/뒤 패턴의 음영 계단.',
        'H': '앞', 'T': '뒤',
    },
    'hi': {
        'xlabel': 'सिक्का उछाल संख्या',
        'ylabel': 'प्रायिकता स्थान (0–1)',
        'title': 'नमूना स्थान विभाजन: 6 सिक्का उछाल\nक्रमागत च/प पैटर्न की छायांकित सीढ़ियाँ।',
        'H': 'च', 'T': 'प',
    },
    'zh': {
        'xlabel': '硬币投掷次数',
        'ylabel': '概率空间 (0–1)',
        'title': '样本空间划分：6次硬币投掷\n连续正/反面模式的阴影阶梯。',
        'H': '正', 'T': '反',
    },
}


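def outcome_to_sequence(outcome, length, h_char, t_char):
    """Map an outcome index to its flip sequence (binary digits: 1 = heads, 0 = tails)."""
    if length == 0:
        # Before any flip there is a single, empty sequence.
        return ''
    # Zero-pad to `length` bits so every flip gets a character.
    bits = format(outcome, f'0{length}b')
    return ''.join(h_char if bit == '1' else t_char for bit in bits)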


def render(locale, theme, labels):
    is_light = theme == 'light'
    bg = 'white' if is_light else 'black'
    fg = 'black' if is_light else 'white'

    fig, ax = plt.subplots(figsize=(14, 8), facecolor=bg)
    ax.set_facecolor(bg)

    colors = plt.cm.Blues(np.linspace(0.3, 0.9, NUM_FLIPS + 1))
    h_char = labels['H']
    t_char = labels['T']

    for flip in range(NUM_FLIPS + 1):
        total_outcomes = 2**flip
        height = 1 / total_outcomes

        for outcome in range(total_outcomes):
            y_position = outcome * height
            is_staircase = (outcome == 0 or outcome == total_outcomes - 1)

            facecolor = colors[-1] if is_staircase else colors[flip]
            alpha = 1.0 if is_staircase else 0.15
            edgecolor = fg if is_staircase else bg

            rect = plt.Rectangle((flip, y_position), 0.8, height,
                                  linewidth=1, edgecolor=edgecolor,
                                  facecolor=facecolor, alpha=alpha)
            ax.add_patch(rect)

            if flip <= 5:
                label = outcome_to_sequence(outcome, flip, h_char, t_char)
                ax.text(flip + 0.4, y_position + height / 2, label,
                        ha='center', va='center', fontsize=8, color=fg)

    ax.set_xlim(-0.5, NUM_FLIPS + 0.5)
    ax.set_ylim(0, 1)
    ax.set_xlabel(labels['xlabel'], fontsize=12, color=fg)
    ax.set_ylabel(labels['ylabel'], fontsize=12, color=fg)
    ax.set_title(labels['title'], fontsize=14, color=fg)
    ax.set_xticks(range(NUM_FLIPS + 1))
    ax.grid(axis='x', alpha=0.3, color=fg)
    ax.tick_params(colors=fg)
    for spine in ax.spines.values():
        spine.set_color(fg)

    out_dir = ASSETS_DIR / locale
    out_dir.mkdir(parents=True, exist_ok=True)
    suffix = '-light' if is_light else ''
    plt.tight_layout()
    plt.savefig(out_dir / f'coin-sample-space-shaded-staircase{suffix}.png', dpi=150, facecolor=bg)
    plt.close(fig)


def generate(locale, theme):
    font = LOCALE_FONTS.get(locale, 'DejaVu Sans')
    with matplotlib.rc_context({'font.family': font}):
        render(locale, theme, LOCALE_LABELS[locale])


def main():
    for locale in LOCALE_LABELS:
        for theme in ['dark', 'light']:
            print(f'  {locale}/{theme}')
            generate(locale, theme)


main()

Are you starting to see the link between geometry and probability here? You can measure the probability of any given sequence by picking its terminal block and measuring its height. It makes questions like how likely you are to get a specific sequence like 'HTHHTHH' easy to answer: just trace the graph!
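Tracing the graph can also be done in code. The sketch below assumes the same encoding as `outcome_to_sequence` in the plotting script (heads is the bit 1, tails is the bit 0, and the bits are read as a binary number). The helper name `block_for_sequence` and the six-flip example 'HTHHTH' are my own choices for illustration.

```python
def block_for_sequence(sequence):
    """Return (y_position, height) of the block for a string of 'H'/'T' flips."""
    n = len(sequence)
    height = 1 / 2**n
    # Same encoding as the plotting code: H is bit 1, T is bit 0.
    bits = ''.join('1' if flip == 'H' else '0' for flip in sequence)
    outcome = int(bits, 2) if bits else 0
    return outcome * height, height


y_position, height = block_for_sequence('HTHHTH')
print(y_position, height)  # 0.703125 0.015625
```

The height, 1/64, is the probability of that exact sequence, and the position says where on the Y-axis its block sits.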

So why is this important? As you may recall from the law of large numbers, over enough runs, outcomes average out to their true probabilities. If you flip a fair coin over and over and count how many times you get heads or tails, the proportion of each approaches 50%. The above graph doesn't really make this intuitive, though. After all, it seems like the most likely sequences simply collapse into blobs of increasing length.
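The law of large numbers is easy to watch in a simulation. This is a minimal sketch, not part of the figures in this post: the function name, the seed, and the number of flips are arbitrary assumptions.

```python
import random


def running_proportion_of_heads(num_flips, seed=0):
    """Simulate fair coin flips and return the proportion of heads after each flip."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = 0
    proportions = []
    for flip in range(1, num_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1
        proportions.append(heads / flip)
    return proportions


proportions = running_proportion_of_heads(100_000)
for n in (10, 1_000, 100_000):
    print(n, proportions[n - 1])
```

Early values can wander far from 0.5, while later ones should sit very close to it.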

But something is hiding in plain sight: a normal distribution. If we plot the proportion of heads across all possible outcomes for 1 to 100 flips, as shown below, a clear bell-shaped distribution appears around the mean at 50% and then narrows toward it as the number of flips grows.¹

Proportion of Heads in the Coin Sample Space Over N=1 to N=100

Show visualization code

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path
from scipy.special import comb

NUM_FLIPS = 100
ASSETS_DIR = Path(__file__).resolve().parent.parent / "figures"

LOCALE_FONTS = {
    'ja': 'Hiragino Sans',
    'ko': 'Apple SD Gothic Neo',
    'zh': 'Songti SC',
    'hi': 'Kohinoor Devanagari',
}

LOCALE_LABELS = {
    'en': {
        'xlabel': 'Number of Coin Flips (n)',
        'ylabel': 'Proportion of Heads',
        'title': 'Proportion of Heads in the Sample Space\nA Normal Distribution Emerges Naturally',
        'legend': 'Mean (50% heads)',
        'colorbar': 'Relative Probability Density',
    },
    'es': {
        'xlabel': 'Número de lanzamientos (n)',
        'ylabel': 'Proporción de caras',
        'title': 'Proporción de caras en el espacio muestral\nUna distribución normal emerge naturalmente',
        'legend': 'Media (50% caras)',
        'colorbar': 'Densidad de probabilidad relativa',
    },
    'de': {
        'xlabel': 'Anzahl der Münzwürfe (n)',
        'ylabel': 'Anteil der Köpfe',
        'title': 'Anteil der Köpfe im Stichprobenraum\nEine Normalverteilung entsteht natürlich',
        'legend': 'Mittelwert (50% Köpfe)',
        'colorbar': 'Relative Wahrscheinlichkeitsdichte',
    },
    'fr': {
        'xlabel': 'Nombre de lancers (n)',
        'ylabel': 'Proportion de faces',
        'title': "Proportion de faces dans l'espace des résultats\nUne distribution normale émerge naturellement",
        'legend': 'Moyenne (50% faces)',
        'colorbar': 'Densité de probabilité relative',
    },
    'pt': {
        'xlabel': 'Número de lançamentos (n)',
        'ylabel': 'Proporção de caras',
        'title': 'Proporção de caras no espaço amostral\nUma distribuição normal emerge naturalmente',
        'legend': 'Média (50% caras)',
        'colorbar': 'Densidade de probabilidade relativa',
    },
    'pl': {
        'xlabel': 'Liczba rzutów (n)',
        'ylabel': 'Odsetek orłów',
        'title': 'Odsetek orłów w przestrzeni próbkowania\nNaturalnie pojawia się rozkład normalny',
        'legend': 'Średnia (50% orłów)',
        'colorbar': 'Względna gęstość prawdopodobieństwa',
    },
    'ja': {
        'xlabel': 'コイン投げ回数 (n)',
        'ylabel': '表の割合',
        'title': '標本空間における表の割合\n正規分布が自然に現れる',
        'legend': '平均（表 50%）',
        'colorbar': '相対的確率密度',
    },
    'ko': {
        'xlabel': '동전 던지기 횟수 (n)',
        'ylabel': '앞면의 비율',
        'title': '표본 공간에서 앞면의 비율\n정규분포가 자연스럽게 나타남',
        'legend': '평균 (앞면 50%)',
        'colorbar': '상대적 확률 밀도',
    },
    'hi': {
        'xlabel': 'सिक्का उछाल की संख्या (n)',
        'ylabel': 'चित का अनुपात',
        'title': 'नमूना स्थान में चित का अनुपात\nएक सामान्य वितरण स्वाभाविक रूप से उभरता है',
        'legend': 'माध्य (50% चित)',
        'colorbar': 'सापेक्ष प्रायिकता घनत्व',
    },
    'zh': {
        'xlabel': '硬币投掷次数 (n)',
        'ylabel': '正面比例',
        'title': '样本空间中正面的比例\n正态分布自然涌现',
        'legend': '均值（正面 50%）',
        'colorbar': '相对概率密度',
    },
}


def render(locale, theme, labels):
    is_light = theme == 'light'
    bg = 'white' if is_light else 'black'
    fg = 'black' if is_light else 'white'

    fig, ax = plt.subplots(figsize=(20, 14), facecolor=bg)
    ax.set_facecolor(bg)

    for flip in range(1, NUM_FLIPS + 1):
        max_prob = comb(flip, flip // 2, exact=True) * (0.5 ** flip)
        marker_size = max(5, 200 / np.sqrt(flip))

        for num_heads in range(flip + 1):
            prob = comb(flip, num_heads, exact=True) * (0.5 ** flip)
            relative_prob = prob / max_prob
            color = plt.cm.RdYlBu_r(relative_prob)
            y_proportion = num_heads / flip

            ax.scatter(flip, y_proportion, c=[color], s=marker_size,
                       marker='s', edgecolors='none', alpha=0.95, rasterized=True)

    ax.axhline(y=0.5, color=fg, linestyle='--', linewidth=3, alpha=0.9,
               label=labels['legend'], zorder=10)

    for flip in [10, 25, 50, 75, 100]:
        std_dev = np.sqrt(flip * 0.5 * 0.5)
        std_proportion = std_dev / flip
        ax.plot([flip, flip], [0.5 - std_proportion, 0.5 + std_proportion],
                'yellow', linewidth=5, alpha=0.8, zorder=9)
        ax.plot([flip, flip], [0.5 - 2 * std_proportion, 0.5 + 2 * std_proportion],
                'orange', linewidth=3, alpha=0.6, zorder=8)

    ax.set_xlim(0, NUM_FLIPS + 2)
    ax.set_ylim(0, 1)
    ax.set_xlabel(labels['xlabel'], fontsize=16, fontweight='bold', color=fg)
    ax.set_ylabel(labels['ylabel'], fontsize=16, fontweight='bold', color=fg)
    ax.set_title(labels['title'], fontsize=17, fontweight='bold', pad=20, color=fg)

    sm = plt.cm.ScalarMappable(cmap=plt.cm.RdYlBu_r, norm=plt.Normalize(vmin=0, vmax=1))
    sm.set_array([])
    cbar = plt.colorbar(sm, ax=ax, pad=0.01, fraction=0.046)
    cbar.set_label(labels['colorbar'], fontsize=14, fontweight='bold', color=fg)
    cbar.ax.yaxis.set_tick_params(color=fg)
    plt.setp(plt.getp(cbar.ax.axes, 'yticklabels'), color=fg)

    ax.legend(loc='upper right', fontsize=13, framealpha=0.9)
    ax.tick_params(colors=fg)
    for spine in ax.spines.values():
        spine.set_color(fg)

    out_dir = ASSETS_DIR / locale
    out_dir.mkdir(parents=True, exist_ok=True)
    suffix = '-light' if is_light else ''
    plt.tight_layout()
    plt.savefig(out_dir / f'coin-sample-space-proportion-heads{suffix}.png',
                dpi=150, bbox_inches='tight', facecolor=bg)
    plt.close(fig)


def generate(locale, theme):
    font = LOCALE_FONTS.get(locale, 'DejaVu Sans')
    with matplotlib.rc_context({'font.family': font}):
        render(locale, theme, LOCALE_LABELS[locale])


def main():
    for locale in LOCALE_LABELS:
        for theme in ['dark', 'light']:
            print(f'  {locale}/{theme}')
            generate(locale, theme)


main()

Now we can see that the proportion does indeed even out to 50%, and if we extended this to many more flips, the distribution would narrow until it looked like a straight line at 50%.²
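How fast the band narrows follows from the standard deviation of the proportion of heads, which is the same quantity the plotting script computes as `std_proportion`. Here is a short sketch; the function name `proportion_std` is hypothetical.

```python
import math


def proportion_std(n, p=0.5):
    """Standard deviation of the proportion of heads after n independent flips."""
    # Std of the head count is sqrt(n * p * (1 - p)); dividing the count by n
    # to get a proportion divides the std by n as well.
    return math.sqrt(n * p * (1 - p)) / n


for n in (10, 100, 10_000, 1_000_000):
    print(n, proportion_std(n))
```

For a fair coin this works out to 0.5 / sqrt(n), so multiplying the number of flips by 100 makes the band ten times narrower.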

What's interesting is that even at N < 20 the distribution of outcomes is already mostly normal. The yellow and orange lines act as a visual aid, spanning one and two standard deviations around the mean at N = 10, 25, 50, 75, and 100. If we took a vertical slice of the plot at any of those lines, the center would follow the central limit theorem, accumulating most outcomes and smoothing out into a bell, while the very edges show two types of outcomes: at first (N = 1), simply getting a head or not, and later, at a very low probability, getting no heads at all or all heads in a row.
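One way to see how close to normal the distribution already is at N = 20 is to compare the exact binomial probabilities with a normal curve of the same mean and standard deviation. This is a rough sketch under that assumption; the helper names are mine, and it needs Python 3.8 or later for `math.comb`.

```python
import math


def binomial_pmf(n, k, p=0.5):
    """Exact probability of k heads in n independent flips."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


n = 20
mean, std = n * 0.5, math.sqrt(n * 0.25)
for k in range(0, n + 1, 2):
    # Columns: heads, exact binomial probability, normal approximation.
    print(k, round(binomial_pmf(n, k), 4), round(normal_pdf(k, mean, std), 4))
```

The two columns agree to within a few thousandths everywhere, with the largest gap at the peak.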

We can explain this intuitively by mirroring the previous sample space figure and overlaying it on the proportion-of-heads plot:

Overlay of the two previous plots with the sample space mirrored on top of proportion sample space.

Show visualization code

import shutil
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from scipy.special import comb

NUM_FLIPS = 100
STAIRCASE_FLIPS = 6
ALPHA_SCALE = 0.5
ASSETS_DIR = Path(__file__).resolve().parent.parent / "figures"

ALL_LOCALES = ['en', 'es', 'de', 'fr', 'pt', 'pl', 'ja', 'ko', 'hi', 'zh']


def generate_distribution_base(output_path, theme):
    is_light = theme == 'light'
    bg = 'white' if is_light else 'black'

    fig, ax = plt.subplots(figsize=(20, 14), facecolor=bg)
    ax.set_facecolor(bg)

    for flip in range(1, NUM_FLIPS + 1):
        max_prob = comb(flip, flip // 2, exact=True) * (0.5 ** flip)
        marker_size = max(5, 200 / np.sqrt(flip))

        for num_heads in range(flip + 1):
            prob = comb(flip, num_heads, exact=True) * (0.5 ** flip)
            relative_prob = prob / max_prob
            color = plt.cm.RdYlBu_r(relative_prob)
            y_proportion = num_heads / flip

            ax.scatter(flip, y_proportion, c=[color], s=marker_size,
                       marker='s', edgecolors='none', alpha=0.95, rasterized=True)

    ax.set_xlim(0, NUM_FLIPS + 2)
    ax.set_ylim(0, 1)
    ax.set_xticks([])
    ax.set_yticks([])
    for spine in ax.spines.values():
        spine.set_visible(False)

    plt.tight_layout()
    plt.savefig(output_path, dpi=150, bbox_inches='tight', facecolor=bg)
    plt.close(fig)


def generate_staircase_base(output_path, theme):
    is_light = theme == 'light'
    bg = 'white' if is_light else 'black'
    fg = 'black' if is_light else 'white'

    fig, ax = plt.subplots(figsize=(14, 8), facecolor=bg)
    ax.set_facecolor(bg)

    colors = plt.cm.Blues(np.linspace(0.3, 0.9, STAIRCASE_FLIPS + 1))

    for flip in range(STAIRCASE_FLIPS + 1):
        total_outcomes = 2**flip
        height = 1 / total_outcomes

        for outcome in range(total_outcomes):
            y_position = outcome * height
            is_staircase = (outcome == 0 or outcome == total_outcomes - 1)

            facecolor = colors[-1] if is_staircase else colors[flip]
            alpha = 1.0 if is_staircase else 0.15
            edgecolor = fg if is_staircase else bg

            rect = plt.Rectangle((flip, y_position), 0.8, height,
                                  linewidth=1, edgecolor=edgecolor,
                                  facecolor=facecolor, alpha=alpha)
            ax.add_patch(rect)

    ax.set_xlim(-0.5, STAIRCASE_FLIPS + 0.5)
    ax.set_ylim(0, 1)
    ax.set_xticks([])
    ax.set_yticks([])
    for spine in ax.spines.values():
        spine.set_visible(False)

    plt.tight_layout()
    plt.savefig(output_path, dpi=150, bbox_inches='tight', facecolor=bg)
    plt.close(fig)


def overlay_images(base_path, overlay_path, output_path):
    if not base_path.exists():
        print(f'Missing base image: {base_path}')
        return
    if not overlay_path.exists():
        print(f'Missing overlay image: {overlay_path}')
        return

    base = Image.open(base_path).convert('RGBA')
    overlay = Image.open(overlay_path).convert('RGBA')

    overlay_flipped = overlay.transpose(Image.FLIP_LEFT_RIGHT)
    overlay_resized = overlay_flipped.resize(base.size, Image.LANCZOS)

    r, g, b, a = overlay_resized.split()
    a = a.point(lambda v: int(v * ALPHA_SCALE))
    overlay_transparent = Image.merge('RGBA', (r, g, b, a))

    combined = Image.alpha_composite(base, overlay_transparent)
    combined.save(output_path)


def main():
    en_dir = ASSETS_DIR / 'en'
    en_dir.mkdir(parents=True, exist_ok=True)

    for theme in ['dark', 'light']:
        suffix = '-light' if theme == 'light' else ''
        print(f'  generating overlay intermediates ({theme})')

        clean_base = en_dir / f'coin-sample-space-proportion-heads-clean{suffix}.png'
        clean_staircase = en_dir / f'coin-sample-space-shaded-staircase-clean{suffix}.png'
        overlay_out = en_dir / f'coin-sample-space-overlay{suffix}.png'

        generate_distribution_base(clean_base, theme)
        generate_staircase_base(clean_staircase, theme)
        overlay_images(clean_base, clean_staircase, overlay_out)

        for locale in ALL_LOCALES:
            if locale == 'en':
                continue
            locale_dir = ASSETS_DIR / locale
            locale_dir.mkdir(parents=True, exist_ok=True)
            dest = locale_dir / f'coin-sample-space-overlay{suffix}.png'
            shutil.copy(overlay_out, dest)
            print(f'  copied overlay to {locale}/{theme}')


main()

The extreme edges of the bell line up with the consecutive coin paths, which are very unlikely. The majority of outcomes, which occupy the biggest proportion along the center of the sample space (where heads appears 50% of the time), line up with the center of the bell. This shows how the Central Limit Theorem emerges geometrically from the branching structure of a binomial sample space.

Another way to think about this: if you rescale the sample space so that the least likely sequences are drawn smaller and the most likely ones bigger, the proportion of times that heads appears is distributed close to normally (discounting the gap between the consecutive sequences and the roughly normal ones) until the law of large numbers kicks in and the proportion averages out to 50%. Zooming out from the coin-toss example, this applies to any binomial process of independent events.
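As a hedged illustration of that generalization, here is a sketch for a rigged coin. The 70% heads probability, the three-flip length, and the helper name `path_probability` are assumptions chosen for the example, not values taken from the figures above.

```python
from itertools import product


def path_probability(sequence, p_heads):
    """Probability (block height) of one 'H'/'T' sequence for a possibly unfair coin."""
    heads = sequence.count('H')
    tails = len(sequence) - heads
    # Independent flips still multiply, just with unequal factors.
    return p_heads**heads * (1 - p_heads) ** tails


p = 0.7  # assumed bias, for illustration only
paths = [''.join(flips) for flips in product('HT', repeat=3)]
heights = {path: path_probability(path, p) for path in paths}

for path, height in heights.items():
    print(path, round(height, 3))
print(round(sum(heights.values()), 3))  # block heights still fill the whole axis
```

With an unfair coin the blocks in a column no longer have equal heights, but they still partition the Y-axis, so the same geometric reading applies.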

Is this news? Perhaps not. But it is very cool to see how probability can be translated into areas and proportions, which can in turn reveal the underlying probability distribution for a specific outcome in a visual, geometric manner.

I used Perplexity with Claude 4.5 Sonnet to work through my ideas.³ Originally, I began making the above graphs and experiments during class, when I got carried away trying to solve a problem by drawing boxes. That led me to the staircase pattern (in the context of rigged coins) and to digging deeper into the math. Claude walked me through how this ties in nicely with Pascal's triangle:

What we've visualized here is actually a rotated form of Pascal's triangle—a well-established mathematical structure where each number is the sum of the two numbers above it. The connection runs deep: at each flip number N, counting how many paths lead to exactly k heads gives you the entries in row N of Pascal's triangle. These are the binomial coefficients C(N,k), which represent the number of ways to choose k items from N. When we divide each row by 2^N to convert counts into probabilities, we get the binomial distribution. The Central Limit Theorem then guarantees that as N grows large, this distribution approaches the normal curve—which is exactly what we see in our visualization. The geometry of branching paths through sample space doesn't just resemble Pascal's triangle; it is Pascal's triangle, revealing why the bell curve emerges so naturally from repeated binary trials.

In hindsight, it's a very intuitive conclusion, as the count of events in each column of the coin sample space matches the sum of the Nth row in Pascal's triangle.
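The link to Pascal's triangle can be verified directly. A small sketch, assuming Python 3.8 or later for `math.comb`; the function name `pascal_row` is hypothetical.

```python
import math


def pascal_row(n):
    """Row n of Pascal's triangle: the number of paths with k heads, for k = 0..n."""
    return [math.comb(n, k) for k in range(n + 1)]


row = pascal_row(6)
print(row)       # [1, 6, 15, 20, 15, 6, 1]
print(sum(row))  # 64, the number of blocks in the six-flip column

# Dividing the counts by 2**6 turns them into the binomial distribution.
print([count / 2**6 for count in row])
```

The row sums to 2^6 = 64, matching the block count of the last column in the sample space figure, and the largest entry sits in the middle, where the bell peaks.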

If you found this useful or interesting, I'm glad! Do consider writing a technical blog of your own. In an AI-dominated era, mindful writing is more valuable than ever. It is also very useful to enrich your own understanding of things.

¹ I promise I did not go out of my way to write this blog post just to show the visualization of the proportion of heads. It is extremely cool, and reminds me a bit of slope fields.
² Only later, when explaining the law of large numbers to someone, did I realize this graph illustrates the same idea, except that the curves here are smooth and do not show the variability you would normally want in a graph meant to convey convergence. To do that, you'd show many coin flip sequences rather than just the theoretical proportions.
³ If you're curious about the chat I had with Claude about my thinking process, you can read it here.
