Introduction

This R cubature package exposes both the hcubature and pcubature routines of the underlying C cubature library, including the vectorized interfaces.

Per the documentation, use of pcubature is advisable only for smooth integrands in dimensions up to three at most. In fact, the pcubature routines perform significantly worse than the vectorized hcubature in inappropriate cases. So when in doubt, you are better off using hcubature.

Version 2.0 of this package integrates the Cuba library as well, once again providing vectorized interfaces.

The main point of this note is to examine the difference vectorization makes. My recommendations are below in the summary section.

A Timing Harness

Our harness will provide timing results for hcubature, pcubature (where appropriate) and Cuba cuhre calls. We begin by creating a harness for these calls.

library(bench)
library(cubature)

harness <- function(which = NULL,
                    f, fv, lowerLimit, upperLimit, tol = 1e-3, iterations = 20, ...) {
  
  fns <- c(hc = "Non-vectorized Hcubature",
           hc.v = "Vectorized Hcubature",
           pc = "Non-vectorized Pcubature",
           pc.v = "Vectorized Pcubature",
           cc = "Non-vectorized cubature::cuhre",
           cc_v = "Vectorized cubature::cuhre")
  cc <- function() cubature::cuhre(f = f,
                                   lowerLimit = lowerLimit, upperLimit = upperLimit,
                                   relTol = tol,
                                   ...)
  cc_v <- function() cubature::cuhre(f = fv,
                                     lowerLimit = lowerLimit, upperLimit = upperLimit,
                                     relTol = tol,
                                     nVec = 1024L,
                                     ...)
  hc <- function() cubature::hcubature(f = f,
                                       lowerLimit = lowerLimit,
                                       upperLimit = upperLimit,
                                       tol = tol,
                                       ...)
  hc.v <- function() cubature::hcubature(f = fv,
                                         lowerLimit = lowerLimit,
                                         upperLimit = upperLimit,
                                         tol = tol,
                                         vectorInterface = TRUE,
                                         ...)
  pc <- function() cubature::pcubature(f = f,
                                       lowerLimit = lowerLimit,
                                       upperLimit = upperLimit,
                                       tol = tol,
                                       ...)
  pc.v <- function() cubature::pcubature(f = fv,
                                         lowerLimit = lowerLimit,
                                         upperLimit = upperLimit,
                                         tol = tol,
                                         vectorInterface = TRUE,
                                         ...)

  ndim <- length(lowerLimit)
  
  if (is.null(which)) {
    fnIndices <- seq_along(fns)
  } else {
    fnIndices <- na.omit(match(which, names(fns)))
  }
  fnList <- lapply(names(fns)[fnIndices], function(x) call(x))
  
  argList <- c(fnList, iterations = iterations, check = FALSE)
  result <- do.call(bench::mark, args = argList)
  d <- result[seq_along(fnIndices), 1:9]
  d$expression <- fns[fnIndices]
  d
}

We reel off the timing runs.

Example 1.

func <- function(x) sin(x[1]) * cos(x[2]) * exp(x[3])
func.v <- function(x) {
    matrix(apply(x, 2, function(z) sin(z[1]) * cos(z[2]) * exp(z[3])), ncol = ncol(x))
}

d <- harness(f = func, fv = func.v,
             lowerLimit = rep(0, 3),
             upperLimit = rep(1, 3),
             tol = 1e-5,
             iterations = 100)[, 1:9]

knitr::kable(d, digits = 3, row.names = FALSE)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 120µs 130µs 7677.496 61.8KB 77.550 99 1 12.9ms
Vectorized Hcubature 132µs 138µs 7154.789 54.5KB 72.271 99 1 13.8ms
Non-vectorized Pcubature 350µs 361µs 2697.865 31.2KB 55.058 98 2 36.3ms
Vectorized Pcubature 360µs 372µs 2665.535 58.4KB 54.399 98 2 36.8ms
Non-vectorized cubature::cuhre 227µs 232µs 4284.183 40.5KB 43.275 99 1 23.1ms
Vectorized cubature::cuhre 179µs 185µs 5323.173 39.8KB 53.769 99 1 18.6ms

Multivariate Normal

Using cubature, we evaluate \[ \int_R\phi(x)dx \] where \(\phi(x)\) is the three-dimensional multivariate normal density with mean 0, and variance \[ \Sigma = \left(\begin{array}{rrr} 1 &\frac{3}{5} &\frac{1}{3}\\ \frac{3}{5} &1 &\frac{11}{15}\\ \frac{1}{3} &\frac{11}{15} & 1 \end{array} \right) \] and \(R\) is \([-\frac{1}{2}, 1] \times [-\frac{1}{2}, 4] \times [-\frac{1}{2}, 2].\)

We construct a scalar function (my_dmvnorm) and a vector analog (my_dmvnorm_v). First the functions.

m <- 3
sigma <- diag(3)
sigma[2,1] <- sigma[1, 2] <- 3/5 ; sigma[3,1] <- sigma[1, 3] <- 1/3
sigma[3,2] <- sigma[2, 3] <- 11/15
logdet <- sum(log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values))
my_dmvnorm <- function (x, mean, sigma, logdet) {
    x <- matrix(x, ncol = length(x))
    distval <- stats::mahalanobis(x, center = mean, cov = sigma)
    exp(-(3 * log(2 * pi) + logdet + distval)/2)
}

my_dmvnorm_v <- function (x, mean, sigma, logdet) {
    distval <- stats::mahalanobis(t(x), center = mean, cov = sigma)
    exp(matrix(-(3 * log(2 * pi) + logdet + distval)/2, ncol = ncol(x)))
}

Now the timing.

d <- harness(f = my_dmvnorm, fv = my_dmvnorm_v,
             lowerLimit = rep(-0.5, 3),
             upperLimit = c(1, 4, 2),
             tol = 1e-5,
             iterations = 10,
             mean = rep(0, m), sigma = sigma, logdet = logdet)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 222.9ms 227.9ms 4.360 139.58KB 55.370 10 127 2.29s
Vectorized Hcubature 719.9µs 734.2µs 1354.788 1.89MB 0.000 10 0 7.38ms
Non-vectorized Pcubature 95.3ms 97.1ms 10.281 0B 55.515 10 54 972.71ms
Vectorized Pcubature 477µs 495.3µs 1990.374 810.26KB 0.000 10 0 5.02ms
Non-vectorized cubature::cuhre 91.4ms 93.5ms 10.517 0B 54.689 10 52 950.84ms
Vectorized cubature::cuhre 929.5µs 941.8µs 1047.349 898.41KB 0.000 10 0 9.55ms

The effect of vectorization is huge. So it makes sense for users to vectorize the integrands as much as possible for efficiency.

Furthermore, for this particular example, we expect mvtnorm::pmvnorm to do pretty well since it is specialized for the multivariate normal. The vectorized versions of hcubature and pcubature seem competitive and in some cases better, if you compare the table above to the one below.

library(mvtnorm)
g1 <- function() pmvnorm(lower = rep(-0.5, m),
                                  upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
                                  alg = Miwa(), abseps = 1e-5, releps = 1e-5)
g2 <- function() pmvnorm(lower = rep(-0.5, m),
                         upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
                         alg = GenzBretz(), abseps = 1e-5, releps = 1e-5)
g3 <- function() pmvnorm(lower = rep(-0.5, m),
                         upper = c(1, 4, 2), mean = rep(0, m), corr = sigma,
                         alg = TVPACK(), abseps = 1e-5, releps = 1e-5)

knitr::kable(bench::mark(g1(), g2(), g3(), iterations = 20, check = FALSE)[, 1:9],
             digits = 3, row.names = FALSE)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
g1() 323µs 581µs 1845.746 355.03KB 0 20 0 10.8ms
g2() 301µs 553µs 1820.018 2.49KB 0 20 0 11ms
g3() 286µs 526µs 1994.843 2.49KB 0 20 0 10ms

Product of cosines

testFn0 <- function(x) prod(cos(x))
testFn0_v <- function(x) matrix(apply(x, 2, function(z) prod(cos(z))), ncol = ncol(x))

d <- harness(f = testFn0, fv = testFn0_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), iterations = 1000)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 15.1µs 17.1µs 58334.985 7.66KB 58.393 999 1 17.1ms
Vectorized Hcubature 19.8µs 22.1µs 45567.457 1.16KB 45.613 999 1 21.9ms
Non-vectorized Pcubature 17.6µs 18.5µs 53472.985 0B 53.527 999 1 18.7ms
Vectorized Pcubature 35.6µs 36.7µs 26822.919 18.68KB 53.753 998 2 37.2ms
Non-vectorized cubature::cuhre 121µs 126.6µs 7776.688 0B 62.715 992 8 127.6ms
Vectorized cubature::cuhre 105.3µs 110.2µs 8816.691 16.38KB 71.102 992 8 112.5ms

Gaussian function

testFn1 <- function(x) {
    val <- sum(((1 - x) / x)^2)
    scale <- prod((2 / sqrt(pi)) / x^2)
    exp(-val) * scale
}

testFn1_v <- function(x) {
    val <- matrix(apply(x, 2, function(z) sum(((1 - z) / z)^2)), ncol(x))
    scale <- matrix(apply(x, 2, function(z) prod((2 / sqrt(pi)) / z^2)), ncol(x))
    exp(-val) * scale
}

d <- harness(f = testFn1, fv = testFn1_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), iterations = 10)

knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 1.01ms 1.02ms 975.748 67.3KB 108.416 9 1 9.22ms
Vectorized Hcubature 1.45ms 1.48ms 673.010 290.5KB 74.779 9 1 13.37ms
Non-vectorized Pcubature 26.12µs 27.12µs 36251.868 0B 0.000 10 0 275.85µs
Vectorized Pcubature 46.25µs 47.17µs 21038.756 4.1KB 0.000 10 0 475.31µs
Non-vectorized cubature::cuhre 4.82ms 4.95ms 203.222 0B 87.095 7 3 34.45ms
Vectorized cubature::cuhre 6.03ms 6.04ms 165.472 971.5KB 110.315 6 4 36.26ms

Discontinuous function

testFn2 <- function(x) {
    radius <- 0.50124145262344534123412
    ifelse(sum(x * x) < radius * radius, 1, 0)
}

testFn2_v <- function(x) {
    radius <- 0.50124145262344534123412
    matrix(apply(x, 2, function(z) ifelse(sum(z * z) < radius * radius, 1, 0)), ncol = ncol(x))
}

d <- harness(which = c("hc", "hc.v", "cc", "cc_v"),
             f = testFn2, fv = testFn2_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), iterations = 10)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 14.3ms 14.8ms 64.635 17.7KB 96.953 10 15 154.72ms
Vectorized Hcubature 14.1ms 14.4ms 69.021 1011.8KB 96.630 10 14 144.88ms
Non-vectorized cubature::cuhre 271.6ms 276.6ms 3.553 0B 74.979 10 211 2.81s
Vectorized cubature::cuhre 255.6ms 262.6ms 3.798 21.2MB 69.887 10 184 2.63s

A Simple Polynomial (product of coordinates)

testFn3 <- function(x) prod(2 * x)
testFn3_v <- function(x) matrix(apply(x, 2, function(z) prod(2 * z)), ncol = ncol(x))

d <- harness(f = testFn3, fv = testFn3_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), iterations = 50)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 24.3µs 26.1µs 37488.813 6.75KB 0.00 50 0 1.33ms
Vectorized Hcubature 30.9µs 31.6µs 31024.512 2.91KB 0.00 50 0 1.61ms
Non-vectorized Pcubature 21.6µs 22.3µs 43750.825 0B 0.00 50 0 1.14ms
Vectorized Pcubature 28.2µs 29.2µs 33357.477 19.12KB 0.00 50 0 1.5ms
Non-vectorized cubature::cuhre 264.5µs 270.2µs 3644.598 0B 74.38 49 1 13.45ms
Vectorized cubature::cuhre 190.5µs 199.8µs 4948.027 39.84KB 100.98 49 1 9.9ms

Gaussian centered at \(\frac{1}{2}\)

testFn4 <- function(x) {
    a <- 0.1
    s <- sum((x - 0.5)^2)
    ((2 / sqrt(pi)) / (2. * a))^length(x) * exp (-s / (a * a))
}

testFn4_v <- function(x) {
    a <- 0.1
    r <- apply(x, 2, function(z) {
        s <- sum((z - 0.5)^2)
        ((2 / sqrt(pi)) / (2. * a))^length(z) * exp (-s / (a * a))
    })
    matrix(r, ncol = ncol(x))
}

d <- harness(f = testFn4, fv = testFn4_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), iterations = 20)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 480.23µs 495.07µs 2005.473 85.2KB 105.551 19 1 9.47ms
Vectorized Hcubature 546.53µs 576.62µs 1740.550 147.2KB 91.608 19 1 10.92ms
Non-vectorized Pcubature 734.64µs 763.17µs 1306.317 0B 68.754 19 1 14.54ms
Vectorized Pcubature 815.65µs 837.1µs 1189.970 68.9KB 62.630 19 1 15.97ms
Non-vectorized cubature::cuhre 1.22ms 1.24ms 805.426 0B 42.391 19 1 23.59ms
Vectorized cubature::cuhre 1.11ms 1.14ms 859.780 125.6KB 95.531 18 2 20.94ms

Double Gaussian

testFn5 <- function(x) {
    a <- 0.1
    s1 <- sum((x - 1 / 3)^2)
    s2 <- sum((x - 2 / 3)^2)
    0.5 * ((2 / sqrt(pi)) / (2. * a))^length(x) * (exp(-s1 / (a * a)) + exp(-s2 / (a * a)))
}
testFn5_v <- function(x) {
    a <- 0.1
    r <- apply(x, 2, function(z) {
        s1 <- sum((z - 1 / 3)^2)
        s2 <- sum((z - 2 / 3)^2)
        0.5 * ((2 / sqrt(pi)) / (2. * a))^length(z) * (exp(-s1 / (a * a)) + exp(-s2 / (a * a)))
    })
    matrix(r, ncol = ncol(x))
}

d <- harness(f = testFn5, fv = testFn5_v,
             lowerLimit = rep(0, 2), upperLimit = rep(1, 2), iterations = 20)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 1.32ms 1.34ms 740.809 133KB 82.312 18 2 24.3ms
Vectorized Hcubature 1.46ms 1.54ms 659.112 249KB 73.235 18 2 27.3ms
Non-vectorized Pcubature 885.97µs 911.39µs 1099.329 0B 57.859 19 1 17.3ms
Vectorized Pcubature 1.04ms 1.07ms 933.854 69KB 103.762 18 2 19.3ms
Non-vectorized cubature::cuhre 2.55ms 2.6ms 385.834 0B 68.088 17 3 44.1ms
Vectorized cubature::cuhre 2.67ms 2.77ms 362.021 224KB 90.505 16 4 44.2ms

Tsuda’s Example

testFn6 <- function(x) {
    a <- (1 + sqrt(10.0)) / 9.0
    prod( a / (a + 1) * ((a + 1) / (a + x))^2)
}

testFn6_v <- function(x) {
    a <- (1 + sqrt(10.0)) / 9.0
    r <- apply(x, 2, function(z) prod( a / (a + 1) * ((a + 1) / (a + z))^2))
    matrix(r, ncol = ncol(x))
}

d <- harness(f = testFn6, fv = testFn6_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), iterations = 20)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 566.87µs 591.3µs 1697.030 64.6KB 89.317 19 1 11.2ms
Vectorized Hcubature 655.67µs 680.97µs 1470.313 156KB 77.385 19 1 12.9ms
Non-vectorized Pcubature 3.05ms 3.1ms 322.840 0B 80.710 16 4 49.6ms
Vectorized Pcubature 3.09ms 3.22ms 312.259 386.2KB 78.065 16 4 51.2ms
Non-vectorized cubature::cuhre 1.58ms 1.61ms 618.790 0B 68.754 18 2 29.1ms
Vectorized cubature::cuhre 1.49ms 1.53ms 642.970 225.8KB 71.441 18 2 28ms

Morokoff & Calflish Example

testFn7 <- function(x) {
    n <- length(x)
    p <- 1/n
    (1 + p)^n * prod(x^p)
}
testFn7_v <- function(x) {
    matrix(apply(x, 2, function(z) {
        n <- length(z)
        p <- 1/n
        (1 + p)^n * prod(z^p)
    }), ncol = ncol(x))
}

d <- harness(f = testFn7, fv = testFn7_v,
             lowerLimit = rep(0, 3), upperLimit = rep(1, 3), iterations = 20)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 1.21ms 1.24ms 811.394 32.89KB 90.155 18 2 22.2ms
Vectorized Hcubature 1.21ms 1.22ms 815.052 205.61KB 90.561 18 2 22.1ms
Non-vectorized Pcubature 2.99ms 3.08ms 325.866 0B 108.622 15 5 46ms
Vectorized Pcubature 2.72ms 2.94ms 342.490 386.24KB 85.622 16 4 46.7ms
Non-vectorized cubature::cuhre 14.61ms 14.68ms 68.110 0B 612.994 2 18 29.4ms
Vectorized cubature::cuhre 12.26ms 12.36ms 80.381 2.04MB 455.490 3 17 37.3ms

Wang-Landau Sampling 1d, 2d Examples

I.1d <- function(x) {
    sin(4 * x) *
        x * ((x * ( x * (x * x - 4) + 1) - 1))
}
I.1d_v <- function(x) {
    matrix(apply(x, 2, function(z)
        sin(4 * z) *
        z * ((z * ( z * (z * z - 4) + 1) - 1))),
        ncol = ncol(x))
}
d <- harness(f = I.1d, fv = I.1d_v,
             lowerLimit = -2, upperLimit = 2, iterations = 100)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 56.2µs 57.9µs 16640.94 53.7KB 168.090 99 1 5.95ms
Vectorized Hcubature 72.5µs 74µs 13381.10 68.5KB 0.000 100 0 7.47ms
Non-vectorized Pcubature 20.8µs 21.6µs 46197.96 0B 0.000 100 0 2.17ms
Vectorized Pcubature 50.3µs 52.3µs 19181.88 0B 193.756 99 1 5.16ms
Non-vectorized cubature::cuhre 95.2µs 98µs 10188.71 0B 0.000 100 0 9.81ms
Vectorized cubature::cuhre 159.6µs 165.8µs 5925.25 0B 59.851 99 1 16.71ms
I.2d <- function(x) {
    x1 <- x[1]; x2 <- x[2]
    sin(4 * x1 + 1) * cos(4 * x2) * x1 * (x1 * (x1 * x1)^2 - x2 * (x2 * x2 - x1) +2)
}
I.2d_v <- function(x) {
    matrix(apply(x, 2,
                 function(z) {
                     x1 <- z[1]; x2 <- z[2]
                     sin(4 * x1 + 1) * cos(4 * x2) * x1 * (x1 * (x1 * x1)^2 - x2 * (x2 * x2 - x1) +2)
                 }),
           ncol = ncol(x))
}
d <- harness(f = I.2d, fv = I.2d_v,
             lowerLimit = rep(-1, 2), upperLimit = rep(1, 2), iterations = 100)
knitr::kable(d, digits = 3)
expression min median itr/sec mem_alloc gc/sec n_itr n_gc total_time
Non-vectorized Hcubature 1.81ms 1.87ms 533.812 78.4KB 65.977 89 11 166.7ms
Vectorized Hcubature 1.64ms 1.72ms 580.982 304.7KB 71.807 89 11 153.2ms
Non-vectorized Pcubature 161.29µs 167.28µs 5846.189 0B 59.052 99 1 16.9ms
Vectorized Pcubature 191.92µs 204.88µs 4873.923 18.3KB 49.232 99 1 20.3ms
Non-vectorized cubature::cuhre 480.27µs 502.62µs 1970.367 0B 60.939 97 3 49.2ms
Vectorized cubature::cuhre 419.68µs 444.28µs 2246.679 60.1KB 45.851 98 2 43.6ms

Implementation Notes

About the only real modification we have made to the underlying cubature library is that we use M = 16 rather than the default M = 19 suggested by the original author for pcubature. This allows us to comply with CRAN package size limits and seems to work reasonably well for the above tests. Future versions will allow for such customization on demand.

The changes made to the Cuba library are managed in a Github repo branch: each time a new release is made, we update the main branch, and keep all changes for Unix platforms in a branch named R_pkg against the current main branch. Customization for windows is done in the package itself using the Makevars.win script.

Summary

The recommendations are:

  1. Vectorize your function. The time spent in so doing pays back enormously. This is easy to do and the examples above show how.

  2. Vectorized hcubature seems to be a good starting point.

  3. For smooth integrands in low dimensions (\(\leq 3\)), pcubature might be worth trying out. Experiment before using in a production package.

Session Info

sessionInfo()
## R version 4.5.2 (2025-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Tahoe 26.1
## 
## Matrix products: default
## BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Los_Angeles
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] mvtnorm_1.3-3    cubature_2.1.4-1 bench_1.1.4     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.38     R6_2.6.1          fastmap_1.2.0     xfun_0.54        
##  [5] magrittr_2.0.4    glue_1.8.0        cachem_1.1.0      tibble_3.3.0     
##  [9] knitr_1.50        pkgconfig_2.0.3   htmltools_0.5.8.1 rmarkdown_2.30   
## [13] lifecycle_1.0.4   profmem_0.7.0     cli_3.6.5         vctrs_0.6.5      
## [17] sass_0.4.10       jquerylib_0.1.4   compiler_4.5.2    tools_4.5.2      
## [21] pillar_1.11.1     evaluate_1.0.5    bslib_0.9.0       Rcpp_1.1.0       
## [25] yaml_2.3.10       rlang_1.1.6       jsonlite_2.0.0