Skip to contents

Introduction

The manytestsr package incorporates state-of-the-art statistical methodologies for hierarchical testing and multiple comparison corrections. This vignette demonstrates advanced features based on foundational works by:

  • Jelle Goeman (Closed Testing Procedures)
  • Aaditya Ramdas (E-value Methodology)
  • Nicolai Meinshausen (Hierarchical Variable Importance Testing)
  • Paul Rosenbaum (Design Sensitivity Analysis)

These methodologies address key challenges in modern statistical inference: maintaining statistical validity while maximizing power in complex hierarchical testing scenarios.

Setup and Data Preparation

library(manytestsr)
library(data.table)
library(dplyr)
library(ggplot2)

# Load example data
data(example_dat, package = "manytestsr")

# Prepare individual-level data
idat <- as.data.table(example_dat)

# Create block-level summary with enhanced metrics
bdat <- idat %>%
  group_by(blockF) %>%
  summarize(
    nb = n(),
    pb = mean(trt),
    hwt = (nb / nrow(idat)) * (pb * (1 - pb)),
    place = first(place),
    year = first(year),
    .groups = "drop"
  ) %>%
  as.data.table()

cat("Dataset:", nrow(idat), "individuals across", nrow(bdat), "blocks\n")
#> Dataset: 1268 individuals across 44 blocks

1. Goeman’s Closed Testing Procedure

Theory

Goeman’s closed testing procedure (Goeman & Solari, 2011) provides a powerful approach to hierarchical testing that:

  • Controls FWER strongly across all hypotheses
  • Exploits hierarchical structure for increased power
  • Uses intersection-union tests for logical consistency

The key insight: test all possible intersections of hypotheses and reject individual hypotheses only if all containing intersection hypotheses are rejected.

Implementation

# Run hierarchical testing with Goeman's closed testing
result_goeman <- find_blocks(
  idat = idat,
  bdat = bdat,
  blockid = "blockF",
  splitfn = splitCluster,
  pfn = pIndepDist, # Robust test as Goeman would recommend
  fmla = Y1 ~ trtF | blockF,
  splitby = "hwt",
  parallel = "no",
  use_closed_testing = TRUE,
  closed_testing_method = "simes", # Goeman & Solari's recommended method
  thealpha = 0.05,
  maxtest = 15,
  trace = FALSE
)

# Examine results
if ("closed_testing_reject" %in% names(result_goeman$node_dat)) {
  closed_rejections <- sum(result_goeman$node_dat$closed_testing_reject, na.rm = TRUE)
  cat("Closed testing rejections:", closed_rejections, "\n")

  # Show rejected nodes
  rejected_nodes <- result_goeman$node_dat[closed_testing_reject == TRUE]
  if (nrow(rejected_nodes) > 0) {
    cat("Rejected node details:\n")
    print(rejected_nodes[, .(nodenum, p, closed_testing_reject, depth)])
  }
} else {
  cat("Closed testing procedure was not applied (likely due to early termination)\n")
}

Key Features of Closed Testing

# Compare traditional vs closed testing power
traditional_detections <- report_detections(result_goeman$bdat, fwer = TRUE)
traditional_hits <- sum(traditional_detections$hit, na.rm = TRUE)

cat("Comparison of Methods:\n")
cat("- Traditional FWER control:", traditional_hits, "detections\n")
if ("closed_testing_reject" %in% names(result_goeman$node_dat)) {
  closed_hits <- sum(result_goeman$node_dat$closed_testing_reject, na.rm = TRUE)
  cat("- Goeman closed testing:", closed_hits, "detections\n")
  cat("- Power improvement:", closed_hits - traditional_hits, "\n")
}

2. Meinshausen’s Hierarchical Testing with Sequential Rejection

Theory

Meinshausen (2008) addresses high-dimensional variable selection by:

  • Testing variable groups hierarchically rather than individually
  • Maintaining power when variables are correlated
  • Sequential rejection (enhanced by Goeman & Solari 2010) for improved efficiency

Implementation

# Apply Meinshausen's hierarchical testing with sequential rejection
result_meinshausen <- find_blocks(
  idat = idat,
  bdat = bdat,
  blockid = "blockF",
  splitfn = splitCluster,
  pfn = pIndepDist,
  fmla = Y2 ~ trtF | blockF,
  splitby = "hwt",
  parallel = "no",
  use_meinshausen = TRUE,
  meinshausen_method = "simes",
  meinshausen_sequential = TRUE, # Use Goeman-Solari enhancement
  thealpha = 0.05,
  maxtest = 15,
  trace = FALSE
)

# Examine Meinshausen results
if ("meinshausen_reject" %in% names(result_meinshausen$node_dat)) {
  meinshausen_rejections <- sum(result_meinshausen$node_dat$meinshausen_reject, na.rm = TRUE)
  cat("Meinshausen hierarchical rejections:", meinshausen_rejections, "\n")

  # Show rejected nodes with adjusted p-values
  rejected_meinshausen <- result_meinshausen$node_dat[meinshausen_reject == TRUE]
  if (nrow(rejected_meinshausen) > 0) {
    cat("Meinshausen rejected nodes:\n")
    print(rejected_meinshausen[, .(nodenum, p, meinshausen_adjusted_p, meinshausen_reject, depth)])
  }
} else {
  cat("Meinshausen testing was not applied\n")
}

Sequential vs Traditional Meinshausen

# Compare sequential vs traditional Meinshausen
result_meinshausen_trad <- find_blocks(
  idat = idat,
  bdat = bdat,
  blockid = "blockF",
  splitfn = splitCluster,
  pfn = pIndepDist,
  fmla = Y2 ~ trtF | blockF,
  splitby = "hwt",
  parallel = "no",
  use_meinshausen = TRUE,
  meinshausen_method = "simes",
  meinshausen_sequential = FALSE, # Traditional approach
  thealpha = 0.05,
  maxtest = 15,
  trace = FALSE
)

# Compare approaches
if (all(c("meinshausen_reject") %in% names(result_meinshausen$node_dat))) {
  seq_rejections <- sum(result_meinshausen$node_dat$meinshausen_reject, na.rm = TRUE)

  if ("meinshausen_reject" %in% names(result_meinshausen_trad$node_dat)) {
    trad_rejections <- sum(result_meinshausen_trad$node_dat$meinshausen_reject, na.rm = TRUE)

    cat("Meinshausen Method Comparison:\n")
    cat("- Sequential rejection approach:", seq_rejections, "rejections\n")
    cat("- Traditional approach:", trad_rejections, "rejections\n")
    cat("- Improvement from sequential:", seq_rejections - trad_rejections, "\n")
  }
}

3. Ramdas E-value Methodology

Theory

E-values (Ramdas et al.) provide a modern framework for:

  • Sequential testing without fixed sample sizes
  • Always-valid inference regardless of stopping rules
  • Wealth accumulation metaphor for evidence gathering

Implementation

# Apply e-value methodology (currently experimental)
result_evalues <- find_blocks(
  idat = idat,
  bdat = bdat,
  blockid = "blockF",
  splitfn = splitCluster,
  pfn = pOneway,
  fmla = Y1 ~ trtF | blockF,
  splitby = "hwt",
  parallel = "no",
  use_evalues = TRUE,
  evalue_wealth_rule = "kelly",
  thealpha = 0.05,
  maxtest = 10,
  trace = FALSE
)

cat("E-value methodology provides always-valid inference\n")
cat("Particularly useful for:\n")
cat("- Sequential data collection\n")
cat("- Optional stopping\n")
cat("- Online learning scenarios\n")

4. Rosenbaum’s Design Sensitivity Analysis

Theory

Rosenbaum’s design sensitivity analysis evaluates:

  • Robustness to unobserved confounding
  • Sensitivity parameters (Γ values) for causal claims
  • Bounds on treatment effects under confounding scenarios

Simulated Example

# Create simulated data with potential confounding structure
set.seed(123)
n_blocks <- 20
n_per_block <- 15

# Simulate data with block-level confounders
sensitivity_data <- data.table(
  block = rep(paste0("B", 1:n_blocks), each = n_per_block),
  unit_id = 1:(n_blocks * n_per_block)
)

# Add treatment assignment with potential bias
sensitivity_data[, block_confounder := rnorm(1), by = block]
sensitivity_data[, treatment := rbinom(.N, 1, plogis(0.2 * block_confounder))]

# Add outcomes with treatment effect + confounding
sensitivity_data[, outcome := rnorm(.N,
  mean = 0.3 * treatment + 0.4 * block_confounder,
  sd = 1
)]

cat("Simulated sensitivity analysis dataset:\n")
#> Simulated sensitivity analysis dataset:
cat("- Blocks with varying confounding levels\n")
#> - Blocks with varying confounding levels
cat("- Treatment effects potentially biased by unobserved factors\n")
#> - Treatment effects potentially biased by unobserved factors
cat("- Gamma sensitivity parameters can assess robustness\n")
#> - Gamma sensitivity parameters can assess robustness

5. Comprehensive Method Comparison

Power and Error Control Comparison

# Compare all methods on same outcome
methods_comparison <- data.frame(
  Method = character(),
  Rejections = numeric(),
  Power_Metric = numeric(),
  Conservative_Level = character(),
  stringsAsFactors = FALSE
)

# Traditional approach
traditional_hits <- sum(traditional_detections$hit, na.rm = TRUE)
methods_comparison <- rbind(methods_comparison, data.frame(
  Method = "Traditional FWER",
  Rejections = traditional_hits,
  Power_Metric = traditional_hits,
  Conservative_Level = "High"
))

# Goeman closed testing
if ("closed_testing_reject" %in% names(result_goeman$node_dat)) {
  closed_hits <- sum(result_goeman$node_dat$closed_testing_reject, na.rm = TRUE)
  methods_comparison <- rbind(methods_comparison, data.frame(
    Method = "Goeman Closed Testing",
    Rejections = closed_hits,
    Power_Metric = closed_hits,
    Conservative_Level = "Medium"
  ))
}

# Meinshausen hierarchical
if ("meinshausen_reject" %in% names(result_meinshausen$node_dat)) {
  meinshausen_hits <- sum(result_meinshausen$node_dat$meinshausen_reject, na.rm = TRUE)
  methods_comparison <- rbind(methods_comparison, data.frame(
    Method = "Meinshausen Sequential",
    Rejections = meinshausen_hits,
    Power_Metric = meinshausen_hits,
    Conservative_Level = "Low"
  ))
}

print(methods_comparison)

Visualization of Method Performance

if (nrow(methods_comparison) > 1) {
  # Create comparison plot
  ggplot(methods_comparison, aes(x = Method, y = Rejections, fill = Conservative_Level)) +
    geom_col(alpha = 0.7) +
    geom_text(aes(label = Rejections), vjust = -0.5) +
    labs(
      title = "Comparison of Advanced Testing Methodologies",
      subtitle = "Number of rejections by method",
      x = "Statistical Method",
      y = "Number of Rejections",
      fill = "Conservativeness"
    ) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    scale_fill_manual(values = c("High" = "#d62728", "Medium" = "#ff7f0e", "Low" = "#2ca02c"))
}

6. Practical Guidelines

When to Use Each Method

Method Best For Key Advantage Considerations
Goeman Closed Testing Hierarchical hypotheses with logical structure Strong FWER control with power gains Computational complexity
Meinshausen Sequential High-dimensional problems with correlated variables Excellent power for group testing Requires hierarchical clustering
Ramdas E-values Sequential data collection Always-valid inference Still experimental in package
Rosenbaum Sensitivity Causal inference with confounding concerns Robustness assessment Requires sensitivity parameter selection

Implementation Workflow

# Step 1: Prepare data with appropriate block-level covariates
bdat <- prepare_block_data(idat, block_vars = c("size", "location", "time"))

# Step 2: Choose methodology based on research question
if (hierarchical_structure) {
  use_closed_testing <- TRUE
}

if (high_dimensional && correlated_blocks) {
  use_meinshausen <- TRUE
}

if (sequential_data_collection) {
  use_evalues <- TRUE
}

# Step 3: Apply comprehensive testing
results <- find_blocks(
  idat = idat, bdat = bdat,
  use_closed_testing = use_closed_testing,
  use_meinshausen = use_meinshausen,
  use_evalues = use_evalues,
  # ... other parameters
)

# Step 4: Validate results with sensitivity analysis
if (causal_inference_context) {
  sensitivity_results <- design_sensitivity_analysis(
    idat, bdat,
    formula = outcome ~ treatment,
    gamma_range = seq(1, 2, by = 0.1)
  )
}

7. Advanced Features and Extensions

Consonance Property Checking

Advanced hierarchical testing procedures should satisfy the consonance property - if a hypothesis is rejected, all more specific hypotheses should be testable.

# Check logical consistency of hierarchical rejections
consonance_check <- check_consonance_property(
  result_goeman$node_dat,
  result_goeman$node_tracker,
  rejection_column = "closed_testing_reject"
)

if (!consonance_check$is_consonant) {
  warning("Consonance violations detected in hierarchical testing")
}

Multiple Outcome Testing

# Test multiple outcomes with joint error control
outcomes <- c("Y1", "Y2")
joint_results <- lapply(outcomes, function(outcome) {
  formula_str <- paste(outcome, "~ trtF | blockF")
  find_blocks(
    idat = idat, bdat = bdat,
    fmla = as.formula(formula_str),
    use_closed_testing = TRUE,
    use_meinshausen = TRUE
  )
})

Conclusion

The manytestsr package integrates cutting-edge statistical methodologies for hierarchical testing:

  1. Goeman’s closed testing provides rigorous FWER control with power improvements
  2. Meinshausen’s hierarchical approach excels in high-dimensional, correlated settings
  3. Ramdas e-values enable always-valid sequential inference
  4. Rosenbaum’s sensitivity analysis assesses robustness to confounding

These methods collectively address the modern challenges of multiple testing in complex experimental designs while maintaining statistical rigor and maximizing power to detect true effects.

References

  • Goeman, J. J., & Solari, A. (2011). Multiple testing for exploratory research. Statistical Science, 26(4), 584-597.
  • Meinshausen, N. (2008). Hierarchical testing of variable importance. Biometrika, 95(2), 265-278.
  • Ramdas, A., Ruf, J., Larsson, M., & Koolen, W. M. (2020). A unified treatment of multiple testing with prior knowledge using the e-value. Annals of Statistics, 48(5), 2790-2807.
  • Rosenbaum, P. R. (2017). Observation and Experiment: An Introduction to Causal Inference. Harvard University Press.