citestR is a lightweight R client for the citest Python package.
It lets you run the conditional-independence-of-missingness test without
using reticulate at runtime — all communication happens
over a local HTTP connection to a FastAPI server that wraps the Python
package.
The chunk below runs during vignette build. It uses mocked HTTP responses, so it exercises the package interface without requiring a live Python backend.
mock_json_response <- function(body, status = 200L) {
function(req) {
httr2::response(
status_code = status,
headers = list("Content-Type" = "application/json"),
body = charToRaw(jsonlite::toJSON(body, auto_unbox = TRUE))
)
}
}
pkg_env <- citestR:::.pkg_env
old_process <- pkg_env$process
old_port <- pkg_env$port
old_base_url <- pkg_env$base_url
on.exit({
pkg_env$process <- old_process
pkg_env$port <- old_port
pkg_env$base_url <- old_base_url
}, add = TRUE)
pkg_env$process <- list(is_alive = function() TRUE)
pkg_env$port <- 9999L
pkg_env$base_url <- "http://127.0.0.1:9999"
# Mock a ci_test response
mock_fit <- mock_json_response(list(
model_id = "test-001",
dataset_id = "ds-001",
results = list(
m = 0.12,
B = 0.04,
W_bar = 0.01,
T = 0.05,
t_k = 2.5,
p_k = 0.017,
p_2s = 0.034,
df = 9
)
))
example_data <- data.frame(
Y = c(1.2, -0.4, 0.7, 0.3, -1.1),
X1 = c(NA, 0.5, -1.1, 0.8, NA),
X2 = c(0.3, 1.4, -0.2, 0.6, 0.9)
)
result <- httr2::with_mocked_responses(mock_fit, {
ci_test(example_data, y = "Y", m = 2L, n_folds = 2L)
})
result
#>
#> Conditional Independence of Missingness Test
#>
#> Mean diff in BCE: 0.12
#> t-statistic: 2.5
#> df: 9
#> p-value: 0.034 (two-sided)
#>
#> Result: Evidence against conditional independence (p < 0.05)If you don’t already have a Python environment with
citest installed, the package provides a helper:
library(citestR)
# Creates a virtualenv called "citest_env" and installs the citest API backend
install_backend(method = "pip")You only need to do this once.
library(citestR)
# Example data frame with some missing values
set.seed(42)
n <- 500
df <- data.frame(
Y = rnorm(n),
X1 = rnorm(n),
X2 = rnorm(n),
X3 = rnorm(n)
)
# Introduce MAR missingness on X2
df$X2[df$X1 > 0.5] <- NA
# Run the CI test (server starts automatically)
result <- ci_test(
data = df,
y = "Y",
imputer = "iterative",
classifier = "rf",
m = 5L,
n_folds = 5L
)
result$resultsThe first call starts the Python server in the background; subsequent calls reuse the running process.