This function accepts a data frame and, optionally, test-specific arguments (such as whether the test is asymptotic or simulation based). It returns a p-value.

Usage

pIndepDist(
  dat,
  fmla = YcontNorm ~ trtF | blockF,
  simthresh = 20,
  sims = 1000,
  parallel = "yes",
  ncpu = NULL,
  groups = NULL,
  distfn = dists_and_trans,
  adaptive_dist_function = TRUE
)

Arguments

dat

An object inheriting from class data.frame

fmla

A formula appropriate to the function. Here it should be something like outcome~treatment|block

simthresh

is the size of the data below which we use direct permutations for p-values

sims

Either NULL (meaning use an asymptotic reference distribution) or a number (meaning sample from the randomization distribution implied by the formula)

parallel

If "no", parallelization is not used; otherwise it should be "multicore" or "snow", as used in the call to coin::independence_test() (see the help for coin::approximate()). In addition, if parallel is not "no" and adaptive_dist_function is TRUE, an OpenMP version of the distance creation function is called using ncpu threads (or parallel::detectCores(logical=FALSE) cores).

ncpu

is the number of CPUs to be used for parallel operation.

groups

is a vector defining the groups within which the inter-unit distances are calculated. Not used here.

distfn

is a function that produces one or more vectors (a data frame or matrix) with the same number of rows as dat

adaptive_dist_function

is TRUE if the distance calculation function should be chosen using previous benchmarks. See the code.

Value

A p-value

Details

For now, this function does an omnibus-style chi-square test using (1) the ratio of distances to controls to distances to treated observations within block; (2) the rank of distances to controls for each unit; and (3) the raw outcome.
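
To make the three ingredients concrete, the following base R sketch computes them on a toy data set. It is not the package's implementation: the use of absolute outcome differences as the distance, the use of mean distances, the treatment of the first factor level as "control", and the helper name block_dist_stats are all assumptions made for illustration.

```r
# Toy data with the column names used in the default formula.
dat <- data.frame(
  YcontNorm = c(1.0, 2.0, 4.0, 7.0, 0.5, 1.5, 3.5, 6.5),
  trtF = factor(c(0, 1, 0, 1, 0, 1, 0, 1)),
  blockF = factor(rep(c("a", "b"), each = 4))
)

# Hypothetical helper: per-unit distance summaries within one block.
# Assumptions: distance = absolute outcome difference; first level of trt = control.
block_dist_stats <- function(y, trt) {
  d <- abs(outer(y, y, "-"))        # pairwise distances within the block
  diag(d) <- NA                     # drop each unit's distance to itself
  is_ctrl <- trt == levels(trt)[1]
  mean_to_ctrl <- rowMeans(d[, is_ctrl, drop = FALSE], na.rm = TRUE)
  mean_to_trt <- rowMeans(d[, !is_ctrl, drop = FALSE], na.rm = TRUE)
  data.frame(
    ratio_ctrl_trt = mean_to_ctrl / mean_to_trt, # (1) ratio of distances
    rank_to_ctrl = rank(mean_to_ctrl),           # (2) rank of distance to controls
    raw_y = y                                    # (3) raw outcome
  )
}

# Apply the helper block by block; rows come back in block order,
# which matches dat here because dat is already sorted by blockF.
stats_by_block <- do.call(
  rbind,
  lapply(split(dat, dat$blockF), function(b) block_dist_stats(b$YcontNorm, b$trtF))
)
print(stats_by_block)
```

Each column of stats_by_block could then serve as one response in an omnibus test of independence from treatment within block.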

Although the distances are calculated by block, our profiling suggests that it is better to parallelize the distance creation function distfn (implemented in C++ in the fastfns.cpp file) than to rely on the data.table approach of setDTthreads(). So, here we assume that data.table uses a single thread.
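
A minimal usage sketch, based only on the signature shown under Usage, might look like the following. The toy data are invented for illustration, and it is an assumption that a plain data.frame with these columns is sufficient input. The call is guarded so that the snippet runs even when the package providing pIndepDist() is not attached.

```r
set.seed(12345)

# Invented toy data: 4 blocks of 10 units, half treated within each block.
dat <- data.frame(
  YcontNorm = rnorm(40),
  trtF = factor(rep(c(0, 1), times = 20)),
  blockF = factor(rep(1:4, each = 10))
)

# Only call the function if it is available in the current session.
if (exists("pIndepDist")) {
  p <- pIndepDist(
    dat,
    fmla = YcontNorm ~ trtF | blockF,
    sims = 1000,      # sample from the randomization distribution
    parallel = "no",  # skip parallelization for a small example
    ncpu = 1
  )
  print(p)
}
```

Setting sims = NULL instead would request the asymptotic reference distribution, as described under Arguments.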