This page presents an example of a permutation test based on the optimal transport of a bivariate test statistic \(\mathbf{T}\) for a three-sample problem from Section 2 of the paper Hlávka et al. (2025).
Consider three independent random samples from shifted continuous distributions. Namely, let \(f\) be a density with finite variance, and let \(X_{i,j}\), \(j=1,\dots,n_i\), be a random sample from the distribution with density \(f(x-\mu_i)\) for \(i=1,2,3\).
The null hypothesis \[ H_0:\mu_1=\mu_2=\mu_3 \] is equivalent to the pair of hypotheses \[ H_{0,1}: \mu_1=\mu_2, \quad H_{0,2}: \mu_1=\mu_3. \] Each hypothesis \(H_{0,k}\), \(k=1,2\), may be tested separately by a two-sample t-test. Let \(T_k\) be the corresponding test statistic, \(k=1,2\). We use the bivariate test statistic \(\mathbf{T}=(T_1,T_2)^\top\) for testing \(H_0\), so the dimension of the problem is \(d=2\).
The data are generated independently from normal distributions, that is, \(X_{i,j}\sim N(\mu_i,\sigma^2)\), \(j=1,\dots,n_i\), for \(i\in\{1,2,3\}\), with equal sample sizes \(n_1=n_2=n_3\). The parameters are set as \(\mu_1=\mu_2=0\), \(\mu_3=2\), \(\sigma^2=1\), \(n_1=5\).
set.seed(123)
n1 <- 5
x1 <- rnorm(n1, 0, 1)
x2 <- rnorm(n1, 0, 1)
x3 <- rnorm(n1, 2, 1)
x <- c(x1, x2, x3)
id <- c(rep(1, n1), rep(2, n1), rep(3, n1))
The resulting data are shown in the following plot:
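The plot is not embedded in this text version; a minimal sketch that redraws it from the simulated data above, using the base-R function stripchart (the particular styling is an assumption):

```r
set.seed(123)
n1 <- 5
x1 <- rnorm(n1, 0, 1)   # sample 1: mean 0
x2 <- rnorm(n1, 0, 1)   # sample 2: mean 0
x3 <- rnorm(n1, 2, 1)   # sample 3: mean 2
x  <- c(x1, x2, x3)
id <- factor(rep(1:3, each = n1))
# one-dimensional scatter of the three samples, one column per group
stripchart(x ~ id, vertical = TRUE, pch = 19,
           xlab = "sample", ylab = "x")
```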
The test statistics \(T_1\) and \(T_2\) are computed and stored in a 2-dimensional vector \(\mathbf{T}_0\).
T1.0 <- t.test(x[1:n1], x[(n1+1):(2*n1)])$stat
T2.0 <- t.test(x[1:n1], x[(2*n1+1):(3*n1)])$stat
(T.0 <- c(T1.0, T2.0))
## t t
## 0.3750625 -4.5776556
In order to conduct the multivariate permutation test, recall first the procedure summarized in Algorithm 1 of the paper:

Algorithm 1.
Input: A \(d\)-variate test statistic \(\mathbf{T}_0\) for testing \(H_0\).
1. Choose a grid of \(B+1\) points in the unit ball.
2. Compute the permutation replicas \(\mathbf{T}_1,\dots,\mathbf{T}_B\) of the test statistic.
3. Compute the discrete optimal transport \(F_{\pm}^{(B+1)}\) of \(\mathbf{T}_0,\mathbf{T}_1,\dots,\mathbf{T}_B\) to the grid.
4. Reject \(H_0\) if the norm \(\|F_{\pm}^{(B+1)}(\mathbf{T}_0)\|\) is large, i.e., if the corresponding p-value is small.
If the null hypothesis is rejected, it is also desirable to identify which of the partial hypotheses contribute to the rejection; this is quantified by the vector of partial contributions \(\mathbf{D}\) at the end of this page.
According to the recommendations from Section 4.5 of the paper Hlávka et al. (2025), it is reasonable to choose \(B\approx 1000\), but for a better visualisation we choose \(B=399\), so a grid of \(B+1=400\) points is used. In this example we utilize a product-type grid with \(n_R=20\) radii and \(n_S=20\) directions.
For \(d=2\) it is easy to construct \(n_S\) vectors regularly spaced on the unit sphere. The grid points are of the form \[ \mathbf{g}_{ij} = \frac{i}{n_R+1} \left(\cos(2\pi \varphi_j),\sin (2\pi \varphi_j) \right)^\top, \quad \varphi_j = \frac{j-1}{n_S}, \] \(i=1,\dots,n_R\), \(j=1,\dots,n_S\).
n.R <- 20; n.S <- 20; n0 <- 0  # n0 = number of extra grid points at the origin (none here)
xs1 <- rep(1:n.R, each = n.S)/(n.R + 1)
xs2 <- rep((0:(n.S-1))/n.S, n.R)
s1 <- c(xs1 * cos(2 * pi * xs2), rep(0,n0))
s2 <- c(xs1 * sin(2 * pi * xs2), rep(0,n0))
grid <- cbind(s1, s2)
plot(grid, cex = 0.3, pch = 19, asp = 1, xlab = expression(g[1]), ylab = expression(g[2]))
Generated product-type grid of 400 points.
In this step, the vectors \(\mathbf{T}_1,\dots,\mathbf{T}_B\) of permutation test statistics are computed.
set.seed(123)
B <- n.R*n.S-1
# perm. test statistics T_b (in rows):
T.b <- matrix(NA, ncol = 2, nrow = B)
for(b in 1:B){
  x.b <- sample(x, size = length(x), replace = FALSE)
  T.b[b, 1] <- t.test(x.b[1:n1], x.b[(n1+1):(2*n1)])$stat
  T.b[b, 2] <- t.test(x.b[1:n1], x.b[(2*n1+1):(3*n1)])$stat
}
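The loop above can equivalently be written with replicate; a self-contained sketch that repeats the data generation from above (the helper name t.stat and the object name T.b2 are ours; with the same seed, sample(x) draws the same permutations as the loop):

```r
set.seed(123)
n1 <- 5
x <- c(rnorm(n1, 0, 1), rnorm(n1, 0, 1), rnorm(n1, 2, 1))
n.R <- 20; n.S <- 20
B <- n.R * n.S - 1

# helper: two-sample t statistic
t.stat <- function(a, b) t.test(a, b)$stat

set.seed(123)
T.b2 <- t(replicate(B, {
  x.b <- sample(x)  # random permutation of the pooled sample
  c(t.stat(x.b[1:n1], x.b[(n1+1):(2*n1)]),
    t.stat(x.b[1:n1], x.b[(2*n1+1):(3*n1)]))
}))
dim(T.b2)  # 399 x 2
```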
Since we work in \(d=2\), we can visualize the situation:
plot(T.b, pch = 21, col = "blue", bg = "lightblue", type = "p", xlab = expression(T[1]), ylab = expression(T[2]))
points(T.0[1], T.0[2], pch = 19, col = "red")
text(T.0[1]-0.3, T.0[2]-0.8, expression(bold(T)[0]), col = "red")
The test statistic \(\mathbf{T}_0\) (the red point) visibly lies outside the main cloud of permutation replicas, which suggests a violation of the null hypothesis.
In this step we compute the discrete optimal transport \(F_{\pm}^{(B+1)}\) of \(\mathbf{T}_0,\mathbf{T}_1,\dots,\mathbf{T}_B\) to the target grid. The optimization is performed using the function solve_LSAP from the package clue. The data to be transported are stored in a \(400\times 2\) matrix y such that the first row corresponds to the test statistic \(\mathbf{T}_0\) computed from the original data and the remaining rows are the permutation replicas \(\mathbf{T}_b\), \(b=1,\dots,B=399\). The \(400\times 400\) matrix costes contains the squared \(L_2\) distances \(\| \mathbf{T}_i - \mathbf{g}_j\|^2\) between each row of y and each grid point.
require(clue)
# elementwise squared distance (applied columnwise via outer below)
sqdist <- function(a, b){
  return(abs(a - b)^2)
}
y <- rbind(T.0, T.b)
costes <- t(outer(grid[,1], y[,1], "sqdist")) + t(outer(grid[, 2], y[, 2], "sqdist"))
asignacion <- solve_LSAP(costes)
FB <- grid[asignacion, ]
Now FB[1,] is \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) and FB[b+1,] corresponds to \(F_{\pm}^{(B+1)}(\mathbf{T}_b)\). We can highlight \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) in red in the grid:
plot(FB, pch = 21, col = "blue", bg = "lightblue", type = "p", cex = 0.3, xlab = expression(g[1]), ylab = expression(g[2]))
points(FB[1,1], FB[1,2], pch = 19, col = "red")
We see that \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) lies on the outermost orbit, so \(\mathbf{T}_0\) belongs to the \(n_S\) most extreme cases.
In order to better understand the optimal transport, we can color some more points of the plots above to see where they are mapped.
The p-value \(\widehat{p}_a\) is obtained by the formula \[ \widehat{p}_a = 1-\|F_{\pm}^{(B+1)}(\mathbf T_0)\|. \]
vnorm <- function(x) sqrt(sum(x^2))
(p.a <- 1-vnorm(FB[1,]))
## [1] 0.04761905
Notice that in this example the p-value 0.0476 cannot be interpreted as borderline (as in classical statistical inference), because \(\widehat{p}_a\) takes values in the set \(\{1/(n_R+1),\dots, n_R/(n_R+1)\}\). For our choice \(n_R=20\), the smallest possible p-value equals \(1/21 \doteq 0.0476\).
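The attainable values of \(\widehat{p}_a\) can be enumerated directly; a small base-R check:

```r
n.R <- 20
# attainable values of p_a: 1 - i/(n_R + 1), i = 1, ..., n_R
p.values <- 1 - (1:n.R)/(n.R + 1)
min(p.values)  # smallest possible p-value: 1/21
```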
Alternatively, we can compute \(\widehat{p}_e\) according to the formula \[ \widehat{p}_e = \frac{1}{B+1} \left(1+\sum_{b=1}^{B} \boldsymbol{1}\bigl(\|F_{\pm}^{(B+1)}(\mathbf{T}_{b})\| \geq \|F_{\pm}^{(B+1)}(\mathbf{T}_{0})\| \bigr)\right). \]
We already know from the plot above that \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) lies on the outermost orbit, so there are \(n_S-1\) permutation test statistics whose transported norm is the same or greater, and since \(B+1=n_Rn_S\), it holds that \[ \widehat{p}_e=\frac{1+n_S-1}{n_R n_S}=\frac{1}{n_R}, \] which is 0.05.
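The arithmetic of the last display can be checked directly:

```r
n.R <- 20; n.S <- 20
# p_e = (1 + (n_S - 1)) / (n_R * n_S) = 1/n_R
p.e <- (1 + n.S - 1)/(n.R * n.S)
p.e  # 0.05
```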
# vector of norms:
FB.norms <- apply(FB,1,vnorm)
(p.e <- 1/(B + 1) * (1 + sum(FB.norms[-1] >= FB.norms[1])))
## [1] 0.05
In some situations the above R formula for \(\widehat{p}_e\) can yield a value smaller than 0.05, even though \(\widehat{p}_e\) cannot be smaller than 0.05, or, more generally, it may not give the correct value of \(\widehat{p}_e\). The problem is the rounding of real values. Theoretically, we should obtain \(n_R\) different values of \(\|F_{\pm}^{(B+1)}(\cdot)\|\), each shared by the \(n_S\) observations on the same orbit. However, due to floating-point rounding, this is not necessarily the case.
Hence, in general, we have to be more careful and solve this problem by suitable rounding of the values.
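A toy illustration of the floating-point issue, unrelated to the data above: two numbers that are equal on paper may compare unequal in double precision, so the \(\geq\) in the formula for \(\widehat{p}_e\) can undercount ties; rounding to a few digits restores the comparison:

```r
# 0.3 - 0.1 should equal 0.2, but in double precision it is slightly below it
(0.3 - 0.1) >= 0.2                                       # FALSE
round(0.3 - 0.1, digits = 6) >= round(0.2, digits = 6)   # TRUE after rounding
```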
FB.norms <- round(FB.norms, digits = 6)
(p.e <- 1/(B + 1) * (1 + sum(FB.norms[-1] >= FB.norms[1])))
## [1] 0.05
This always gives the correct p-value. Again, this p-value cannot be interpreted as borderline, because it is the smallest possible value that can be obtained for the chosen grid.
We have seen that \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) lies close to the point \((0,-1)^\top\), which indicates that the null hypothesis is rejected mainly due to violation of \(H_{0,2}\), and that the sign of \(\mu_1-\mu_3\) seems to be negative. Alternatively, we can compute the vector of partial contributions, denoted as \(\mathbf{D}\) in the paper.
D <- FB[1, ]/vnorm(FB[1, ])
# contributions in percentages
D^2*100
## s1 s2
## 9.54915 90.45085
The contribution of the second test statistic to the rejection of \(H_0\) is about 90 %, while the first test statistic contributes only about 10 %.