Back to the main page.


This page presents an example of a permutation test based on the transportation of a bivariate test statistic \(\mathbf{T}\) for a three sample problem from Section 2 of the paper Hlávka et al. (2025).

Problem description

Consider three independent random samples from shifted continuous distributions. Namely, let \(f\) be a density with a finite variance, and \(X_{i,j}\), \(j=1,\dots,n_i\), is a random sample from a distribution with density \(f(x-\mu_i)\) for \(i=1,2,3\).

The null hypothesis \[ H_0:\mu_1=\mu_2=\mu_3 \] is equivalent to the pair of hypothesis \[ H_{0,1}: \mu_1=\mu_2, \quad H_{0,2}: \mu_1=\mu_3. \] Each hypothesis \(H_{0,k}\), \(k=1,2\), may be tested separately by a two sample t-test with. Let \(T_k\) be the corresponding test statistic, \(k=1,2\). We use a bivariate test statistic \(\mathbf{T}=(T_1,T_2)^\top\) for testing \(H_0\), so the dimension of the problem is \(d=2\).

Data generation and computation of the test statistic

The data are generated independently from normal distributions, that is \(X_{i,j}\sim N(\mu_i,\sigma^2)\), \(j=1,\dots,n_i\), for \(i\in\{1,2,3\}\) and for \(n_1=n_2=n_3\). The parameters are set as \(\mu_1=\mu_2=0\), \(\mu_3=2\), \(\sigma^2=1\), \(n_1 =5\).

set.seed(123)

n1 <- 5
x1 <- rnorm(n1, 0, 1)
x2 <- rnorm(n1, 0, 1)
x3 <- rnorm(n1, 2, 1)

x <- c(x1, x2, x3)
id <- (c(rep(1,n1),rep(2,n1),rep(3,n1)))

The resulting data are shown in the following plot:

The test statistics \(T_1\) and \(T_2\) are computed and stored in a 2-dimensional vector \(\mathbf{T}_0\).

T1.0 <- t.test(x[1:n1], x[(n1+1):(2*n1)])$stat
T2.0 <- t.test(x[1:n1], x[(2*n1+1):(3*n1)])$stat
(T.0 <- c(T1.0, T2.0))
##          t          t 
##  0.3750625 -4.5776556

Multivariate OT permutation test

In order to conduct the multivariate permutation test, recall first the procedure summarized in Algorithm 1 of the paper:

Algorithm 1.

Input: A \(d\)-variate test statistic \(\mathbf{T}_0\) for testing \(H_0\).

  1. Choose the number of permutations \(B\).
  2. Compute grid \(G_{B+1}\) of \(B+1\) points.
  3. Compute permutation test statistics \(\mathbf{T}_1,\dots,\mathbf{T}_B\).
  4. Compute the transport \({F}^{(B+1)}_{\pm}\).
  5. Calculate the final p-value as \(\widehat{p}_e\) or \(\widehat{p}_a\).

If the null hypothesis is rejected, it is also desirable to

  1. Interpret the partial contributions.

1. Choice of tuning parameters

According to the recommendations from Section 4.5 of the paper Hlávka et al. (2025), it is reasonable to choose \(B\approx 1000\), but for a better visualisation we choose \(B=399\), so a grid of \(B+1=400\) is used. In this example we utilize a product type grid with \(n_R=20\) and \(n_S=20\).

2. Generation of a grid

For \(d=2\) it is easy to construct \(n_S\) vectors regularly spaced on the unit sphere. The grid points are of the form \[ \mathbf{g}_{ij} = \frac{i}{n_R+1} \left(\cos(2\pi \varphi_j),\sin (2\pi \varphi_j) \right)^\top, \quad \varphi_j = \frac{j-1}{n_S}, \] \(i=1,\dots,n_R\), \(j=1,\dots,n_S\).

n.R = 20; n.S = 20; n0 = 0


xs1 <- rep(1:n.R, each = n.S)/(n.R + 1)
xs2 <- rep((0:(n.S-1))/n.S, n.R)

s1 <- c(xs1 * cos(2 * pi * xs2), rep(0,n0))
s2 <- c(xs1 * sin(2 * pi * xs2), rep(0,n0))

grid <- cbind(s1, s2)

plot(grid, cex = 0.3, pch = 19, asp = 1, xlab = expression(g[1]), ylab = expression(g[2]))
Generated product type grid of 400 points.

Generated product type grid of 400 points.

3. Permutation test statistics

In this step, the vectors \(\mathbf{T}_1,\dots,\mathbf{T}_B\) of permutation test statistics are computed.

set.seed(123)
B <- n.R*n.S-1

# perm. test statistics T_b (in rows):
T.b <- matrix(NA, ncol = 2, nrow = B) 


for(b in 1:B){
  x.b <- sample(x, size = length(x), replace = FALSE)
  T.b[b, 1] = t.test(x.b[1:n1], x.b[(n1+1):(2*n1)])$stat
  T.b[b, 2] = t.test(x.b[1:n1], x.b[(2*n1+1):(3*n1)])$stat
}

Since we work in \(d=2\), we can visualize the situation:

plot(T.b, pch = 21, col = "blue", bg = "lightblue", type = "p", xlab = expression(T[1]), ylab = expression(T[2]))
points(T.0[1], T.0[2], pch = 19, col = "red")
text(T.0[1]-0.3, T.0[2]-0.8, expression(bold(T)[0]), col = "red")

It is visible that the test statistic \(\mathbf{T}_0\) (the red point) lies outside the main cloud, which indicates violation of the null hypothesis.

4. Optimal transport

In this step we compute the discrete optimal transport \(F_{\pm}^{(B+1)}\) of \(\mathbf{T}_0,\mathbf{T}_1,\dots,\mathbf{T}_B\) to the target grid. The optimization is performed using the function solve_LSAP from the package clue. The data to be transported are stored in \(400\times 2\) matrix y such that the first row corresponds to the test statistic \(\mathbf{T}_0\) computed from the original data and the remaining rows are the permutation replicas \(\mathbf{T}_b\), \(b=1,\dots,B=399\). The \(400\times 400\) matrix costes contains the squared \(L_2\) distances \(\| \mathbf{T}_b - \mathbf{g}_b\|^2\).

require(clue)

sqdist <- function(a,b){
  return(abs(a - b)^2)
}

y <- rbind(T.0, T.b)
costes <- t(outer(grid[,1], y[,1], "sqdist")) + t(outer(grid[, 2], y[, 2], "sqdist"))
asignacion <- solve_LSAP(costes)
FB <- grid[asignacion, ]

Now FB[1,] is \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) and FB[b+1,] corresponds to \(F_{\pm}^{(B+1)}(\mathbf{T}_b)\). We can stress \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) by a red color in the grid:

plot(FB, pch = 21, col = "blue", bg = "lightblue", type = "p", cex = 0.3, xlab = expression(g[1]), ylab = expression(g[2]))

points(FB[1,1], FB[1,2], pch = 19, col = "red")

We see that \(F_{\pm}^{(B)}(\mathbf{T}_0)\) lies on the most outer orbit, so \(\mathbf{T}_0\) belongs to \(n_S\) most extreme cases.

In order to better understand the optimal transport, we can color some more points to see where they are mapped:

5. Resulting p-value

The p-value \(\widehat{p}_a\) is obtained by the formula \[ \widehat{p}_a =1-\|F_{\pm}^{(B)}(\mathbf T_0)\|. \]

vnorm <- function(x) sqrt(sum(x^2))

(p.a <- 1-vnorm(FB[1,]))
## [1] 0.04761905

Notice that in this example p-value 0.0476 cannot be interpreted as borderline (as in the classical statistical inference), because \(\widehat{p}_a\) takes values in the set \(\{1/(n_R+1),\dots, n_R/(n_R+1)\}\). For our choice \(n_R=20\), the smallest possible p-value is equal to 0.0476.

Alternatively, we can compute \(\widehat{p}_e\) according to the formula \[ \widehat{p}_e = \frac{1}{B+1} \left(1+\sum_{b=1}^{B} \boldsymbol{1}\bigl(\|F_{\pm}^{(B)}(\mathbf{T}_{b})\| \geq \|F_{\pm}^{(B)}(\mathbf{T}_{0})\| \bigr)\right). \]

We already know from the plot above that \(F_{\pm}^{(B)}(\mathbf{T}_0)\) lies on the most outer orbit, so there are \(n_S-1\) permutation test statistics which have norm the same or greater, and since \(B+1=n_Rn_S\), it holds \[ \widehat{p}_e=\frac{1+n_S-1}{n_R n_S}=\frac{1}{n_R} \] which is 0.05.

# vector of norms:
FB.norms <- apply(FB,1,vnorm)

(p.e <- 1/(B + 1) * (1 + sum(FB.norms[-1] >= FB.norms[1])))
## [1] 0.05

In some situations it can happen that the above R formula for \(\widehat{p}_e\) yields a value less than 0.05, even though \(p_e\) cannot be smaller than 0.05, or in general, it does not give the correct value of \(p_e\). The problem is with rounding of real values. Theoretically, we should get \(n.R\) different values for \(\|F_{\pm}^{(B)}(\cdot)\|\), each repeated for \(n.S\) observations on the same orbit. However, this is not necessarily the case, due to rounding.

Hence, in general, we have to be more careful and solve this problem by suitable rounding of the values.

FB.norms <- round(FB.norms, dig = 6)
(p.e <- 1/(B + 1) * (1 + sum(FB.norms[-1] >= FB.norms[1])))
## [1] 0.05

This gives always a correct p-value. Again, this p-value cannot be interpreted as borderline, because it is the smallest possible value which can be obtained for the chosen grid.

6. Partial contributions

We have seen that \(F_{\pm}^{(B+1)}(\mathbf{T}_0)\) lies close to the point \((0,-1)^\top\), which indicates that the null hypothesis is rejected mainly due to violation of \(H_{0,2}\), and that the sign of \(\mu_1-\mu_3\) seems to be negative. Alternatively, we can compute the vector of partial contributions, denoted as \(\mathbf{D}\) in the paper.

D <- FB[1, ]/vnorm(FB[1, ])

# contributions in percentages
D^2*100
##       s1       s2 
##  9.54915 90.45085

The contribution of the second test statistic to the rejection of \(H_0\) is 90 %. The first test statistic contributes only by 10 %.