
Introduction

The fermi package is built around two ideas:

  1. Fermi decompositions of hard problems into simpler parts.
  2. Bayesian updating of beliefs as data arrive.

This vignette focuses on the second idea: conjugate updates using:

  • beta_binom_posterior() – updating beliefs about a probability.
  • gamma_poisson_posterior() – updating beliefs about a rate of counts.
  • normal_normal_posterior() – updating beliefs about a mean with known noise.

Each section below explains:

  • When to use the updating functions and why.
  • How to express your prior with fermi and update it with data.

Each example picks a realistic question, encodes a prior based on rough beliefs, updates with data, and inspects the posterior with posterior_summary() and plot_posterior().

1. beta_binom_posterior(): learning a proportion

1.1 When and why

Use the Beta–Binomial model when:

  • The unknown quantity is a probability (e.g. proportion of users who click a button).
  • Your data are counts of successes out of a number of trials.
  • Trials can be treated as independent with the same underlying probability.

Mathematically:

Prior:
p \sim \mathrm{Beta}(\alpha, \beta)

Data:
y \mid p \sim \mathrm{Binomial}(n, p)

Posterior:
p \mid y \sim \mathrm{Beta}(\alpha + y,\; \beta + n - y)

The posterior has the same functional form as the prior. That’s what “conjugate” means here.
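
As a quick sanity check, the update rule can be applied by hand in base R. The numbers below are illustrative and assume a hypothetical Beta(2, 8) prior, not the expert prior used in the example that follows:

# Illustrative check of the update rule, base R only
# (hypothetical Beta(2, 8) prior; 23 successes in 200 trials)
alpha0 <- 2
beta0 <- 8
y <- 23
n <- 200

# Conjugate update: add successes and failures to the shape parameters
alpha_post <- alpha0 + y       # 25
beta_post <- beta0 + n - y     # 185

alpha_post / (alpha_post + beta_post)              # posterior mean, about 0.119
stats::qbeta(c(0.05, 0.95), alpha_post, beta_post) # 90% equal-tailed interval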

1.2 Example: Email signup rate on a website

Imagine a website with a “join newsletter” box.

We want to estimate the signup rate p (the probability that a visitor signs up), and we want to combine:

  • Prior belief (based on other, similar sites).
  • Fresh data from the current site.

1.2.1 Prior: expert Beta distribution

Suppose you believe there’s an 80% chance that the signup rate is between 5% and 20%.

We can encode this as a Beta prior using expert_beta_prior():

signup_prior <- expert_beta_prior(
  q = c(0.05, 0.20), # quantiles
  p = c(0.1, 0.9) # probabilities
)

signup_prior
#> fermi prior
#>   Distribution: beta 
#>   Source      : expert 
#>   Parameters  :
#> $shape1
#> [1] 3.394814
#> 
#> $shape2
#> [1] 25.12697

stats::qbeta(
  c(0.1, 0.9),
  shape1 = signup_prior$params$shape1,
  shape2 = signup_prior$params$shape2
)
#> [1] 0.05000003 0.19999994

1.2.2 Data: observed signups

Now we run a short campaign and see:

  • 200 visitors
  • 23 signups

visitors <- 200
signups <- 23

1.2.3 Posterior: updated Beta using beta_binom_posterior()

signup_post <- beta_binom_posterior(
  prior = signup_prior,
  successes = signups,
  trials = visitors
)

signup_post
#> fermi posterior
#>   Distribution: beta 
#>   Parameters  :
#> $shape1
#> [1] 26.39481
#> 
#> $shape2
#> [1] 202.127
#> 
#>   Derived from prior (expert)

We can summarise the posterior:

posterior_summary(signup_post, cred_level = 0.9)
#> Posterior summary
#>   Credible level: 90 %
#>   Median        : 0.1142519 
#>   CI            : [0.08241409, 0.152479]
#>   HDI           : [0.08090875, 0.1500094]

And visualise it:

plot_posterior(signup_post, cred_level = 0.9)

Interpretation:

  • The posterior median is your updated “best guess” for the signup rate.
  • The credible interval quantifies uncertainty after seeing data and incorporating the prior.
  • If you observe more data, the posterior will become more concentrated (a quick check follows below).
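
To make the last point concrete, here is a back-of-the-envelope comparison using the shape parameters printed above. The larger sample is hypothetical: ten times the traffic at the same observed signup rate (2000 visitors, 230 signups):

# Standard deviation of a Beta(a, b) distribution
beta_sd <- function(a, b) sqrt(a * b / ((a + b)^2 * (a + b + 1)))

# Posterior from the 200-visitor campaign above
beta_sd(26.39481, 202.127)

# Hypothetical posterior after ten times the data at the same rate
beta_sd(3.394814 + 230, 25.12697 + 2000 - 230)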

2. gamma_poisson_posterior(): learning a rate of counts

2.1 When and why

Use the Gamma–Poisson model when:

  • The unknown quantity is a rate \lambda of events per unit exposure (per day, per month, per person-year, etc.).
  • Data are counts of events, often with an associated exposure (e.g. days observed, people at risk).

Mathematically:

  • Prior: \lambda \sim \mathrm{Gamma}(\alpha, \beta)
  • Data: y_i \mid \lambda \sim \mathrm{Poisson}(\lambda \cdot e_i), where e_i is the exposure for observation i.
  • Posterior: \lambda \mid y \sim \mathrm{Gamma}(\alpha + \sum y_i,\; \beta + \sum e_i)

This is useful whenever you’re counting random events over time or space.
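
As a quick illustration of this update rule, here is a minimal by-hand sketch in base R; the prior, counts, and exposures are made up (the third observation covers only half a unit of exposure):

# Illustrative Gamma-Poisson update, base R only
shape0 <- 3            # prior Gamma(3, 1)
rate0 <- 1
y <- c(3, 4, 2, 6)     # event counts
e <- c(1, 1, 0.5, 1)   # exposure per observation

# Conjugate update: add total events to the shape, total exposure to the rate
shape_post <- shape0 + sum(y)  # 18
rate_post <- rate0 + sum(e)    # 4.5

shape_post / rate_post                                       # posterior mean rate: 4 per unit exposure
stats::qgamma(c(0.05, 0.95), shape = shape_post, rate = rate_post) # 90% interval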

2.2 Example: Support tickets per day

Suppose you help manage a support team and want to understand the average number of tickets per day, \lambda.

You have some prior belief based on experience, and then you collect data from the first week on a new system.

2.2.1 Prior: Gamma for the ticket rate

Say you believe there’s an 80% chance that there are between 1 and 5 tickets per day.

tickets_prior <- expert_gamma_prior(
  q = c(1, 5),
  p = c(0.1, 0.9)
)

tickets_prior
#> fermi prior
#>   Distribution: gamma 
#>   Source      : expert 
#>   Parameters  :
#> $shape
#> [1] 2.886175
#> 
#> $rate
#> [1] 1.032756

stats::qgamma(
  c(0.1, 0.9),
  shape = tickets_prior$params$shape,
  rate = tickets_prior$params$rate
)
#> [1] 1 5

2.2.2 Data: a week of counts

During the first 7 days, you observe the following counts of support tickets:

tickets_week <- c(3, 4, 2, 6, 5, 4, 3)
tickets_week
#> [1] 3 4 2 6 5 4 3

Assume each day represents one unit of exposure. We could also pass a vector of exposures if days were of different length or reliability.

2.2.3 Posterior: updated Gamma using gamma_poisson_posterior()

tickets_post <- gamma_poisson_posterior(
  prior = tickets_prior,
  y = tickets_week,
  exposure = rep(1, length(tickets_week))
)

tickets_post
#> fermi posterior
#>   Distribution: gamma 
#>   Parameters  :
#> $shape
#> [1] 29.88618
#> 
#> $rate
#> [1] 8.032756
#> 
#>   Derived from prior (expert)
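
As noted above, the exposures need not all equal one. A hypothetical variant, assuming the third day was only half a day of coverage:

tickets_post_uneven <- gamma_poisson_posterior(
  prior = tickets_prior,
  y = tickets_week,
  exposure = c(1, 1, 0.5, 1, 1, 1, 1)
)

The rest of this section sticks with the equal-exposure posterior tickets_post.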

Summarise the rate (tickets per day):

posterior_summary(tickets_post, cred_level = 0.9)
#> Posterior summary
#>   Credible level: 90 %
#>   Median        : 3.674416 
#>   CI            : [2.63361, 4.926418]
#>   HDI           : [2.521482, 4.790498]

We can also look at the implied distribution of tickets per day directly:

plot_posterior(tickets_post, cred_level = 0.9)

If you want to see the distribution of the expected number of tickets over a month (say 30 days), just transform the draws:

lambda_draws <- bf_draws(tickets_post, n = 4000)
tickets_month <- lambda_draws * 30

posterior_summary(tickets_month, cred_level = 0.89)
#> Posterior summary
#>   Credible level: 89 %
#>   Median        : 110.7253 
#>   CI            : [82.10701, 145.8198]
#>   HDI           : [82.08899, 145.8115]
plot_posterior(tickets_month, cred_level = 0.89)

Interpretation:

  • The Gamma–Poisson update smoothly blends your prior expectation with observed counts.
  • As you accumulate more days of data, the posterior for the rate narrows, as the quick comparison below shows.
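
A quick check of that narrowing, using the printed parameters and the fact that a Gamma(shape, rate) distribution has standard deviation sqrt(shape) / rate; the 30-day scenario is hypothetical (roughly the same observed rate, 120 tickets in total):

sqrt(29.88618) / 8.032756              # posterior sd after one week
sqrt(2.886175 + 120) / (1.032756 + 30) # hypothetical posterior sd after 30 days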

3. normal_normal_posterior(): learning a mean with known noise

3.1 When and why

Use the Normal–Normal model when:

  • You care about an unknown mean \mu.
  • Individual observations are Normally distributed around that mean with a known standard deviation \sigma.

Mathematically:

  • Prior: \mu \sim \mathrm{Normal}(\mu_0, \tau_0^2)
  • Data: y_i \mid \mu \sim \mathrm{Normal}(\mu, \sigma^2) with a known \sigma.
  • Posterior: \mu \mid y \sim \mathrm{Normal}(\mu_n, \tau_n^2), where \mu_n is a precision-weighted average of the prior mean and the sample mean, and \tau_n^2 is the reciprocal of the combined prior and data precisions.

This is natural when measurement noise is well-understood, e.g. measurement devices with known precision or settings where variability is stable and well-estimated from past studies.
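
The posterior parameters follow the standard precision-weighting formulas, which are easy to compute by hand. A minimal base-R sketch with illustrative numbers (not the data used below):

mu0 <- 60               # prior mean
tau0 <- 12              # prior sd
sigma <- 15             # known observation sd
y <- c(68, 52, 57, 48)  # a few made-up observations

n <- length(y)
post_var <- 1 / (1 / tau0^2 + n / sigma^2)                     # combine prior and data precisions
post_mean <- post_var * (mu0 / tau0^2 + n * mean(y) / sigma^2) # precision-weighted average

post_mean      # pulled from 60 towards mean(y)
sqrt(post_var) # narrower than the prior sd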

3.2 Example: Average response time

Suppose you’re tracking the average response time (in minutes) to a certain type of request.

From historical systems you think:

  • The average response time is around 60 minutes.
  • 90% of your prior belief is between 40 and 80 minutes.

You also believe that, for individual requests, the time is noisy with a standard deviation of about 15 minutes (due to variation in complexity, staff availability, etc.).

3.2.1 Prior: Normal for the mean response time

response_prior <- expert_normal_prior(
  mean = 60,
  lower = 40,
  upper = 80,
  level = 0.9
)

response_prior
#> fermi prior
#>   Distribution: normal 
#>   Source      : expert 
#>   Parameters  :
#> $mean
#> [1] 60
#> 
#> $sd
#> [1] 12.15914

3.2.2 Data: observed response times

You collect response times for 20 requests (in minutes); for this vignette, the observations are simulated:

# Known observation standard deviation
obs_sd <- 15

response_times <- round(stats::rnorm(20, mean = 55, sd = obs_sd))
response_times
#>  [1] 68 68 81 82 79 52 57 67 48 50 39 40 46 81 51 45 39 37 86 77

3.2.3 Posterior: updated Normal using normal_normal_posterior()

response_post <- normal_normal_posterior(
  prior = response_prior,
  y = response_times,
  obs_sd = obs_sd
)

response_post
#> fermi posterior
#>   Distribution: normal 
#>   Parameters  :
#> $mean
#> [1] 59.67475
#> 
#> $sd
#> [1] 3.233339
#> 
#>   Derived from prior (expert)

Summarise the posterior mean:

posterior_summary(response_post, cred_level = 0.9)
#> Posterior summary
#>   Credible level: 90 %
#>   Median        : 59.73669 
#>   CI            : [54.51551, 65.14882]
#>   HDI           : [54.64528, 65.23133]

Visualise the posterior distribution for the mean response time:

plot_posterior(response_post, cred_level = 0.9)

Interpretation:

  • The posterior mean moves from the prior mean (60 minutes) towards the sample mean.
  • How far it moves depends on:
    • The prior uncertainty (how wide the prior is).
    • The number of observations and their variability.
  • With more data, the posterior becomes increasingly dominated by the data.

4. Putting it into a Fermi-style context

The conjugate update functions in fermi are designed to fit naturally into Fermi-style estimation, where prior knowledge is combined with observed data. Conjugate helpers give you fast, transparent updates: write down a defensible prior, add evidence, and immediately see the revised belief. Swap in your own priors and data to fit your domain.

A common workflow looks like this:
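
One possible sketch, using only functions shown in this vignette; the 10,000 monthly visitors in the last step is an assumed input, there just to illustrate the hand-off from the posterior to a Fermi-style calculation:

# 1. Encode a prior from rough expert beliefs
signup_prior <- expert_beta_prior(q = c(0.05, 0.20), p = c(0.1, 0.9))

# 2. Update it with observed data
signup_post <- beta_binom_posterior(
  prior = signup_prior,
  successes = 23,
  trials = 200
)

# 3. Inspect the revised belief
posterior_summary(signup_post, cred_level = 0.9)

# 4. Feed posterior draws into a larger Fermi-style calculation,
#    e.g. expected signups from an assumed 10,000 monthly visitors
p_draws <- bf_draws(signup_post, n = 4000)
monthly_signups <- p_draws * 10000
posterior_summary(monthly_signups, cred_level = 0.9)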

In this way, conjugate updates give you a data-informed starting point for Fermi estimations, while the Fermi simulation lets you combine multiple uncertain components into a final quantity of interest.

Summary

Each function takes a bf_prior and returns a bf_posterior, which can be:

  • summarised with posterior_summary(),
  • visualised with plot_posterior(),
  • turned into draws with bf_draws() for further simulation.

These are small, composable building blocks for Bayesian reasoning about real-world quantities.