P-value function: Independence Treatment Distance Test
pIndepDist.Rd
This function accepts a data frame and, optionally, test-specific arguments (such as whether the test is asymptotic or simulation based). It returns a p-value.
Usage
pIndepDist(
dat,
fmla = YcontNorm ~ trtF | blockF,
simthresh = 20,
sims = 1000,
parallel = "yes",
ncpu = NULL,
groups = NULL,
distfn = dists_and_trans,
adaptive_dist_function = TRUE
)
Arguments
- dat
An object inheriting from class data.frame
- fmla
A formula appropriate to the function. Here it should be something like outcome~treatment|block
- simthresh
is the size of the data below which we use direct permutations for p-values
- sims
Either NULL (meaning use an asymptotic reference distribution) or a number (meaning that many samples are drawn from the randomization distribution implied by the formula)
- parallel
If "no", no parallelization is used; otherwise it should be "multicore" or "snow" and is passed on in the call to coin::independence_test() (see the help for coin::approximate()). Also, if parallel is not "no" and adaptive_dist_function is TRUE, then an OpenMP version of the distance creation function is called using ncpu threads (or parallel::detectCores(logical=FALSE) cores).
- ncpu
is the number of CPUs to be used for parallel operation.
- groups
is a vector defining the groups within which the inter-unit distances are calculated. Not used here.
- distfn
is a function that produces one or more vectors (a data frame or matrix) with the same number of rows as dat
- adaptive_dist_function
is TRUE if the distance calculation function should be chosen using previous benchmarks. See the code.
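To make the sims argument concrete, the following is a minimal base R sketch of what "sampling from the randomization distribution implied by the formula" can mean for a formula like outcome~treatment|block: treatment labels are shuffled separately inside each block. The helper perm_p_within_block() and its test statistic are hypothetical illustrations, not part of this package, and they do not reproduce what coin::independence_test() computes.

```r
# Hypothetical helper (not part of the package): a Monte Carlo p-value from
# permuting treatment labels within blocks.
perm_p_within_block <- function(y, trt, block, sims = 1000) {
  # Test statistic: absolute block-size-weighted mean difference (an assumption)
  stat <- function(z) {
    diffs <- tapply(seq_along(y), block, function(i) {
      mean(y[i][z[i] == 1]) - mean(y[i][z[i] == 0])
    })
    wts <- as.numeric(table(block)[names(diffs)])
    abs(sum(as.numeric(diffs) * wts) / sum(wts))
  }
  obs <- stat(trt)
  # Shuffle treatment separately inside each block
  draws <- replicate(sims, {
    z_star <- ave(trt, block, FUN = function(z) z[sample.int(length(z))])
    stat(z_star)
  })
  # Add one to numerator and denominator so the p-value is never zero
  (1 + sum(draws >= obs)) / (1 + sims)
}

set.seed(1)
block <- rep(c("a", "b"), each = 10)
trt <- rep(rep(c(0, 1), each = 5), times = 2)
y <- rnorm(20) + 2 * trt
p <- perm_p_within_block(y, trt, block, sims = 500)
```

Because the shuffling happens inside each block, every simulated assignment keeps the same number of treated units per block as the observed one.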
Details
For now, this function does an omnibus-style chi-square test using (1) the ratio of distances to controls to distances to treated observations within block; (2) the rank of distances to controls for each unit; and (3) the raw outcome.
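The three ingredients listed above can be sketched in base R. This is only an illustration under stated assumptions: it uses mean absolute differences in the outcome as the distance, and the helper dist_features_one_block() is hypothetical. It is not the package's dists_and_trans() function, whose exact definitions may differ.

```r
# Hypothetical illustration, not the package's dists_and_trans().
dist_features_one_block <- function(y, trt) {
  d <- abs(outer(y, y, "-"))  # pairwise absolute distances within the block
  diag(d) <- NA               # ignore each unit's distance to itself
  is_ctrl <- trt == 0
  mean_to_ctrl <- rowMeans(d[, is_ctrl, drop = FALSE], na.rm = TRUE)
  mean_to_trt <- rowMeans(d[, !is_ctrl, drop = FALSE], na.rm = TRUE)
  data.frame(
    ratio_ctrl_trt = mean_to_ctrl / mean_to_trt,  # (1) ratio of distances
    rank_to_ctrl = rank(mean_to_ctrl),            # (2) rank of distance to controls
    raw_y = y                                     # (3) raw outcome
  )
}

dat <- data.frame(
  y = c(1, 2, 4, 8, 3, 5, 9, 10),
  trt = c(0, 0, 1, 1, 0, 1, 0, 1),
  block = rep(c("b1", "b2"), each = 4)
)

# Apply the transformation block by block and stack the results
feats <- do.call(rbind, lapply(split(dat, dat$block), function(b) {
  dist_features_one_block(b$y, b$trt)
}))
```

The result has one row per unit and one column per transformed outcome, which matches the shape that distfn is documented to return.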
Although the distances are calculated by block, our profiling suggests that it is better to parallelize the distance creation function distfn (done here in C++ in the fastfns.cpp file) rather than use the data.table approach of setDTthreads(). So, here we assume that the number of threads for data.table is 1.
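If the caller is responsible for that thread setting, one way to arrange it is sketched below. The variable names n_threads and old_threads are illustrative, and whether the function adjusts data.table threads itself is not stated here, so treat this as an assumption about usage rather than a required step.

```r
# Number of threads one might pass as ncpu: physical cores, falling back to 1
# because detectCores() can return NA on some platforms.
n_threads <- parallel::detectCores(logical = FALSE)
if (is.na(n_threads)) n_threads <- 1L

# Keep data.table single threaded so it does not compete with distfn threads.
if (requireNamespace("data.table", quietly = TRUE)) {
  old_threads <- data.table::getDTthreads()
  data.table::setDTthreads(1)
  # ... call the test here, for example with ncpu = n_threads ...
  data.table::setDTthreads(old_threads)  # restore the previous setting
}
```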