Block-randomized chunk generation uses fast paths for common small-block cases (e.g., mb == 1, mb == nb - 1) and an internal cache for small combn(nb, mb) patterns.
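A minimal sketch of the fast-path-plus-cache idea, not the package's internal code. The names `block_patterns`, `.pattern_cache`, and `cache_limit` are hypothetical, and the sketch only covers enumerating treated-index patterns for one block of `nb` units with `mb` treated.

```r
# Hypothetical cache keyed by "nb:mb"; not the package's actual cache.
.pattern_cache <- new.env(parent = emptyenv())

block_patterns <- function(nb, mb, cache_limit = 1000) {
  stopifnot(nb >= 1, mb >= 1, mb <= nb)

  if (mb == 1L) {
    # Fast path: each column holds the single treated index.
    return(matrix(seq_len(nb), nrow = 1L))
  }
  if (mb == nb - 1L) {
    # Fast path: each column leaves out one unit. Dropping nb, nb - 1, ..., 1
    # reproduces the column order of combn(nb, nb - 1).
    return(vapply(rev(seq_len(nb)),
                  function(i) seq_len(nb)[-i],
                  integer(mb)))
  }

  # General case: reuse a cached pattern matrix when the pattern count is small.
  key <- paste(nb, mb, sep = ":")
  small <- choose(nb, mb) <= cache_limit
  if (small) {
    hit <- .pattern_cache[[key]]
    if (!is.null(hit)) return(hit)
  }
  out <- utils::combn(nb, mb)
  if (small) assign(key, out, envir = .pattern_cache)
  out
}
```

Both fast paths return the same matrix that `combn(nb, mb)` would, so callers do not need to distinguish them from the general case.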
Speed up comb.method = 2 null distribution:
com_null_dist_block_stratum() now computes per-block statistics with matrix operations instead of per-permutation nested loops.
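The kind of rewrite described here can be illustrated with a small, self-contained example. It assumes the per-block statistic is the sum of scores over treated units, which may differ from what com_null_dist_block_stratum() actually computes; `q` and `patterns` are illustrative names.

```r
set.seed(1)
q <- rnorm(6)                    # hypothetical scores for one block of 6 units
patterns <- utils::combn(6, 2)   # each column: indices of the treated units

# Naive reference: nested loops over patterns and treated units.
naive <- numeric(ncol(patterns))
for (k in seq_len(ncol(patterns))) {
  s <- 0
  for (j in seq_len(nrow(patterns))) {
    s <- s + q[patterns[j, k]]
  }
  naive[k] <- s
}

# Vectorized: look up all scores at once, reshape to the pattern layout,
# then take column sums.
vectorized <- colSums(matrix(q[patterns], nrow = nrow(patterns)))

stopifnot(isTRUE(all.equal(naive, vectorized)))
```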
Reduce allocation/copying overhead in solver setup:
Solver constraint triplets are now assembled via list accumulation rather than repeated c() growth.
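For illustration, a sketch of the accumulation pattern under assumed inputs. `build_triplets` and its `blocks` argument (one vector of column indices per constraint row, each with coefficient 1) are hypothetical and do not reflect the solver's real constraint layout.

```r
build_triplets <- function(blocks) {
  # Preallocate one list slot per constraint row instead of growing
  # i/j/v with c() on every iteration, which copies the vectors each time.
  parts <- vector("list", length(blocks))
  for (r in seq_along(blocks)) {
    cols <- blocks[[r]]
    parts[[r]] <- list(
      i = rep.int(r, length(cols)),   # row index of the constraint
      j = cols,                       # column indices with nonzero entries
      v = rep.int(1, length(cols))    # coefficients (all 1 in this sketch)
    )
  }
  # Flatten once at the end.
  list(
    i = unlist(lapply(parts, `[[`, "i"), use.names = FALSE),
    j = unlist(lapply(parts, `[[`, "j"), use.names = FALSE),
    v = unlist(lapply(parts, `[[`, "v"), use.names = FALSE)
  )
}

trip <- build_triplets(list(1:2, 3:5))
```

The resulting `i`, `j`, `v` vectors have the usual triplet form that sparse-matrix constructors accept.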
rank_score() consistently returns a numeric vector (avoids 1-column matrix output when scale = TRUE).
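One common way a 1-column matrix arises in R is base `scale()`, which always returns a matrix. Whether rank_score() uses `scale()` internally is an assumption; the snippet below only shows the general pattern of dropping the matrix shape.

```r
x <- c(3, 1, 2)

s <- scale(rank(x))   # base scale() returns a 3 x 1 matrix with extra attributes
v <- as.numeric(s)    # drops dim and the "scaled:*" attributes

stopifnot(is.matrix(s), !is.matrix(v), is.null(attributes(v)))
```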
Testing
Added unit tests for internal block-permutation chunk generation and for equivalence of the vectorized com_null_dist_block_stratum() implementation to a naive reference computation.