The problem

A researcher classifying within-case evidence for a working theory ($H_1$) against a single rival ($H_R$) wants to summarize the weight of evidence with a Bayes factor. The challenge is that the per-observation probabilities a Bayes factor requires are hard to motivate without overcommitting. DrWrinch supplies two fully specified generative models that produce those probabilities, both tilted against $H_1$ by construction, so the reported Bayes factor is a lower bound on the evidence in favor of the working theory.

This vignette works the running example from Lopez, Bowers, and Gajardo Cooper (2026): seven pieces of evidence favoring $H_1$ and three favoring $H_R$.

Two models

Binomial model. Each observation is an independent Bernoulli draw from an infinite evidence universe with success probability $\theta$. $H_1$ corresponds to $\theta > 1/2$. Under a uniform prior on $\theta$, the Bayes factor is the ratio of posterior masses on the two sides of $1/2$. This model fits research designs where the evidence universe is open-ended: ongoing interviews, an expanding archive.

bf_binomial(y_W = 7, y_R = 3)
#> [1] 7.827586
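
The posterior-mass ratio can be checked in base R. This is a minimal sketch, not the package's implementation: it assumes that the uniform prior yields a $\text{Beta}(y_W + 1, y_R + 1)$ posterior and that the Bayes factor equals the posterior odds of $\theta > 1/2$ (the prior odds are 1 by symmetry). The helper name bf_binomial_sketch is hypothetical.

```r
# Hypothetical helper for illustration; not part of DrWrinch.
# Assumes a uniform Beta(1, 1) prior, so the posterior is Beta(y_W + 1, y_R + 1).
bf_binomial_sketch <- function(y_W, y_R) {
  mass_above <- pbeta(0.5, y_W + 1, y_R + 1, lower.tail = FALSE)  # P(theta > 1/2 | data)
  mass_below <- pbeta(0.5, y_W + 1, y_R + 1)                      # P(theta < 1/2 | data)
  mass_above / mass_below
}

bf_binomial_sketch(7, 3)
#> [1] 7.827586
```

Under these assumptions the ratio is 1816/232, which agrees with the value printed by bf_binomial() above.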

Hypergeometric urn model (Formulation C). Evidence is drawn without replacement from a finite urn. Writing each urn composition as (pro-$H_1$ items, pro-$H_R$ items), the Working Theory Favorable (WTF) sub-model has composition $(y_W + 1, \max(1, y_R))$ and the Rival Theory Favorable (RTF) sub-model has composition $(y_W, y_W + 1)$. The construction adds one unobserved pro-$H_1$ item to the working theory’s urn and tilts the rival’s urn toward $H_R$ by exactly one item. This model fits research designs where the evidence base is bounded: a closed historical archive, a fixed roster of documents.

bf_urn(y_W = 7, y_R = 3)
#> [1] 39

The hypergeometric BF of 39 has a closed form: $(8/11) / (56/3003) = 39$.
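
The two probabilities in the closed form can be reproduced with base R's dhyper(). The sketch below assumes that each sub-model assigns the observed sample (7 pro-$H_1$ items among 10 draws) its ordinary hypergeometric probability under the stated urn composition; the helper name bf_urn_sketch is hypothetical, and the sketch has only been checked against the examples in this vignette.

```r
# Hypothetical helper for illustration; not part of DrWrinch.
# dhyper(x, m, n, k): probability of x "white" items in k draws
# from an urn holding m white and n black items.
bf_urn_sketch <- function(y_W, y_R) {
  k <- y_W + y_R                                          # number of items observed
  p_wtf <- dhyper(y_W, m = y_W + 1, n = max(1, y_R), k = k)  # WTF urn
  p_rtf <- dhyper(y_W, m = y_W,     n = y_W + 1,     k = k)  # RTF urn
  p_wtf / p_rtf
}

dhyper(7, 8, 3, 10)   # WTF probability, equal to 8/11
dhyper(7, 7, 8, 10)   # RTF probability, equal to 56/3003
bf_urn_sketch(7, 3)
#> [1] 39
```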

When does each model apply?

| Design | Model |
|---|---|
| Ongoing interviews, growing archive, open-ended fieldwork | bf_binomial() |
| Closed historical archive, fixed roster of documents | bf_urn() |

When in doubt, compute both. They report different but compatible quantities: the binomial conditions on the evidence universe being infinite; the urn conditions on it being bounded.

A property worth knowing

Under symmetric evidence ($y_W = y_R$), both Bayes factors are exactly $1$: no evidence in either direction.

bf_binomial(5, 5)
#> [1] 1
bf_urn(5, 5)
#> [1] 1

For bf_urn() this is the central property distinguishing Formulation C from naive constructions: the urn model returns $1$ rather than $\infty$ or $0$ at the equipoise point.
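
The equipoise result for the urn model can be worked out by hand. The derivation below is a sketch that assumes ordinary hypergeometric sampling from the two urn compositions given earlier, with $y_W = y_R = y \ge 1$, so that $2y$ items are drawn from an urn of $2y + 1$; the symbols $P_{\mathrm{WTF}}$ and $P_{\mathrm{RTF}}$ are introduced here for illustration.

```latex
\[
P_{\mathrm{WTF}}
  = \frac{\binom{y+1}{y}\binom{y}{y}}{\binom{2y+1}{2y}}
  = \frac{y+1}{2y+1},
\qquad
P_{\mathrm{RTF}}
  = \frac{\binom{y}{y}\binom{y+1}{y}}{\binom{2y+1}{2y}}
  = \frac{y+1}{2y+1},
\qquad
\frac{P_{\mathrm{WTF}}}{P_{\mathrm{RTF}}} = 1 .
\]
```

For $y = 5$ both probabilities equal $6/11$. In the binomial model the same result follows because the $\text{Beta}(y + 1, y + 1)$ posterior is symmetric about $1/2$.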

Sensitivity to observation bias

A reviewer might ask: how robust is the conclusion to the possibility that pro-$H_1$ items were more likely to be observed than pro-$H_R$ items? sens_urn() reports the smallest observation-bias factor $\omega > 1$ at which the Bayes factor first drops below a decision threshold (the default threshold is $20$, the conventional cutoff for “strong” evidence).

s_urn <- sens_urn(7, 3, threshold = 20)
s_urn$bf
#> [1] 39
s_urn$omega_star
#> [1] 1.304023

The reading: pro-$H_1$ items would have had to be roughly 1.3 times more likely to be noticed than pro-$H_R$ items before the Bayes factor falls below the threshold of 20.
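
One way to picture this calculation is sketched below. It rests on an assumption the text above does not confirm: that the bias $\omega$ enters as the odds parameter of Fisher's noncentral hypergeometric distribution in both sub-models. The function names dfnchyper_sketch and bf_urn_biased are hypothetical, and sens_urn() may be implemented differently.

```r
# Assumed bias model: Fisher's noncentral hypergeometric distribution,
# where each pro-H1 item has odds omega of being observed relative to a pro-HR item.
dfnchyper_sketch <- function(x, m, n, k, omega) {
  support <- max(0, k - n):min(k, m)
  if (!(x %in% support)) return(0)
  w <- choose(m, support) * choose(n, k - support) * omega^support
  w[support == x] / sum(w)
}

# Bayes factor of the WTF urn against the RTF urn at a given bias omega.
bf_urn_biased <- function(y_W, y_R, omega) {
  k <- y_W + y_R
  dfnchyper_sketch(y_W, y_W + 1, max(1, y_R), k, omega) /
    dfnchyper_sketch(y_W, y_W, y_W + 1, k, omega)
}

# Solve BF(omega) = 20 on an assumed search interval of [1, 10].
omega_star <- uniroot(function(w) bf_urn_biased(7, 3, w) - 20,
                      interval = c(1, 10), tol = 1e-9)$root
omega_star
```

At $\omega = 1$ the sketch reduces to the unbiased Bayes factor of 39, and under this assumed bias model the root comes out close to the 1.304 reported above.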

Sensitivity to the prior on $\theta$

For the binomial model, sens_binomial() additionally reports the smallest rival-tilted prior at which the conclusion reverses. The prior $\text{Beta}(1, M + 1)$ posits $M$ pseudo-observations all favoring the rival; M_star is the smallest such $M$ that drives the Bayes factor below threshold.

s_binom <- sens_binomial(17, 3, threshold = 20)
s_binom$bf
#> [1] 1341.607
s_binom$omega_star
#> [1] 2.185972
s_binom$M_star
#> [1] 6

In this stronger example (17 pro-$H_1$ items, 3 pro-$H_R$ items), the conclusion is more robust: the Bayes factor stays above 20 unless pro-$H_1$ items were about 2.2 times as likely to be observed as pro-$H_R$ items, or the prior carries at least 6 rival-favoring pseudo-observations.
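
The search for M_star can be sketched as a simple scan over $M$. The sketch assumes that the quantity compared with the threshold is the posterior odds of $\theta > 1/2$ under the $\text{Beta}(1, M + 1)$ prior, which gives a $\text{Beta}(y_W + 1, y_R + M + 1)$ posterior; the names posterior_odds_sketch and M_star_sketch and the grid upper limit of 50 are hypothetical choices, not part of the package.

```r
# Hypothetical helper for illustration; not part of DrWrinch.
# Prior Beta(1, M + 1) plus y_W successes and y_R failures
# gives the posterior Beta(y_W + 1, y_R + M + 1).
posterior_odds_sketch <- function(y_W, y_R, M) {
  a <- y_W + 1
  b <- y_R + M + 1
  pbeta(0.5, a, b, lower.tail = FALSE) / pbeta(0.5, a, b)
}

M_grid <- 0:50   # assumed search range
odds <- sapply(M_grid, function(M) posterior_odds_sketch(17, 3, M))
M_star_sketch <- M_grid[which(odds < 20)[1]]   # first M that pushes the odds below 20
M_star_sketch
#> [1] 6
```

Under this assumption the odds are about 25.5 at $M = 5$ and about 15.4 at $M = 6$, consistent with the M_star printed above.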

Weighted evidence

A researcher who believes one observation carries more probative weight than the others can sum integer weights and pass the totals:

# Six pro-H1 items at weight 1 and one "smoking gun" at weight 10
w_W <- c(10, rep(1, 6))
w_R <- rep(1, 3)
bf_binomial(sum(w_W), sum(w_R))
#> [1] 775.148
bf_urn(sum(w_W), sum(w_R))
#> [1] 1023512

The “+1” rival-favoring construction goes through unchanged. See the paper’s discussion of probative weight for the rationale.