Package {rocvb}


Type: Package
Title: ROC-Based Inference for Diagnostic Accuracy Under Verification Bias
Version: 0.1.0
Maintainer: Shirui Wang <wangshirui1021@gmail.com>
Description: Provides point estimates and confidence intervals for receiver operating characteristic (ROC)–based diagnostic accuracy metrics for tests and biomarkers subject to verification bias. Supported metrics include the Area Under the ROC Curve (AUC), the Youden index, and the sensitivity at a user‑specified specificity level for two‑class continuous tests under missing‑at‑random (MAR) disease verification. Point estimation follows Alonzo and Pepe (2005) <doi:10.1111/j.1467-9876.2005.00477.x>. Multiple types of confidence intervals are implemented and compared, including bootstrap‑based, Method of Variance Estimates Recovery (MOVER)–based, and empirical likelihood (EL)–based intervals; see Wang et al. (2025) <doi:10.1177/09622802251322989> and https://github.com/swang1021/rocvb.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.3
Imports: emplik, ggplot2, grid, MASS, pROC, stats
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
URL: https://github.com/swang1021/rocvb
BugReports: https://github.com/swang1021/rocvb/issues
NeedsCompilation: no
Packaged: 2026-05-01 09:22:20 UTC; Ray
Author: Shirui Wang [aut, cre]
Repository: CRAN
Date/Publication: 2026-05-05 15:06:10 UTC

Confidence Intervals for AUC Under MAR Verification

Description

Computes point estimates and confidence intervals for the AUC of a continuous test when disease verification is missing at random (MAR). The function returns four estimates simultaneously, one for each of the bias-corrected estimators proposed by Alonzo and Pepe (2005): full imputation (FI), mean score imputation (MSI), inverse probability weighting (IPW), and semiparametric efficient (SPE).

Usage

auc.ci.mar(
  Test,
  D,
  A,
  alpha = 0.05,
  search_step = 0.01,
  tol = 1e-05,
  precision = 1e-04,
  n.boot = 1000,
  plot = TRUE
)

Arguments

Test

Test results; a positive numeric vector.

D

Verified disease status; a logical vector in which NA marks subjects whose disease status was not verified.

A

Covariate; a positive numeric vector. Only one covariate is allowed.

alpha

Significance level for the confidence interval. Default is 0.05.

search_step

Step size used in root searching. Default is 0.01.

tol

Tolerance used in root searching. Default is 1e-5.

precision

Precision parameter used in the regression model. Default is 1e-4.

n.boot

Number of bootstrap replicates. Default is 1000.

plot

Logical; if TRUE (default) a density plot is produced.

Details

The function computes bootstrap and hybrid empirical likelihood confidence intervals for the AUC under verification bias.
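The two bootstrap interval types can be illustrated on complete data. The following is a minimal sketch, not the package's implementation: it uses the uncorrected Mann-Whitney AUC, and it assumes that "classic" means a normal-approximation interval built from the bootstrap standard error. The helper names auc_mw and boot_ci are hypothetical.

```r
# Illustrative only: complete-data AUC, no verification-bias correction.
auc_mw <- function(test, d) {
  x <- test[d]     # test results of diseased subjects
  y <- test[!d]    # test results of non-diseased subjects
  # Mann-Whitney form: P(X > Y) + 0.5 * P(X = Y)
  mean(outer(x, y, ">") + 0.5 * outer(x, y, "=="))
}

boot_ci <- function(test, d, alpha = 0.05, n.boot = 200) {
  est <- auc_mw(test, d)
  n <- length(test)
  # Resample subjects with replacement and recompute the AUC each time
  reps <- replicate(n.boot, {
    idx <- sample.int(n, n, replace = TRUE)
    auc_mw(test[idx], d[idx])
  })
  z <- stats::qnorm(1 - alpha / 2)
  list(
    est = est,
    # Assumed meaning of "classic": estimate +/- z * bootstrap SE
    classic = est + c(-1, 1) * z * stats::sd(reps),
    # Percentile interval: empirical quantiles of the bootstrap replicates
    percentile = unname(stats::quantile(reps, c(alpha / 2, 1 - alpha / 2)))
  )
}

set.seed(1)
d <- rep(c(TRUE, FALSE), times = c(40, 60))
test <- rnorm(100, mean = ifelse(d, 1, 0))
ci <- boot_ci(test, d)
```

The classic interval is symmetric around the point estimate, whereas the percentile interval follows the shape of the bootstrap distribution.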

The disease model \rho is estimated using a probit regression model linear in Test and A based on verified subjects, given by

\rho_i = P(D_i = 1 \mid T_i, A_i) = \Phi(\alpha + \beta T_i + \gamma A_i), \quad i = 1, \ldots, n,

where \Phi denotes the standard normal cumulative distribution function.
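A probit disease model of this form can be fitted with stats::glm. The sketch below is an illustration under assumed simulated data, not the package's internal code; the coefficient values and the object names dat, fit_rho, and rho_hat are hypothetical.

```r
set.seed(1)
n <- 200
Test <- abs(rnorm(n))
A <- abs(rnorm(n))
# Hypothetical true disease model, used only to generate data
D <- rbinom(n, 1, pnorm(-1 + 0.8 * Test + 0.5 * A))
D[sample(n, 60)] <- NA                 # unverified subjects

dat <- data.frame(Test = Test, A = A, D = D)
verified <- !is.na(dat$D)

# Probit regression fitted on verified subjects only
fit_rho <- glm(D ~ Test + A, family = binomial(link = "probit"),
               data = dat[verified, ])

# Estimated disease probabilities for all subjects, verified or not
rho_hat <- predict(fit_rho, newdata = dat, type = "response")
```

Predicting for every subject, including unverified ones, is what allows imputation-type estimators to use the full sample.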

The verification model is estimated using a logit regression model linear in Test and A based on all subjects, given by

\operatorname{logit}(\pi_i) = \log\!\left( \frac{\pi_i}{1 - \pi_i} \right) = \alpha + \beta T_i + \gamma A_i, \quad i = 1, \ldots, n,

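where \pi_i = P(V_i = 1 \mid T_i, A_i) and V_i = 1 if the disease status of subject i is verified and 0 otherwise. The coefficients \alpha, \beta, and \gamma here are separate from those of the disease model.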
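The verification model can be sketched in the same way, using the data from the Examples section of this page. This is an illustration with stats::glm rather than the package's own code; V, fit_pi, and pi_hat are hypothetical names.

```r
set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA

# Verification indicator: 1 if disease status is observed, 0 otherwise
V <- as.integer(!is.na(D))

# Logistic regression of verification on Test and A, using all subjects
fit_pi <- glm(V ~ Test + A, family = binomial(link = "logit"))

# Estimated verification probabilities
pi_hat <- fitted(fit_pi)
```

Because every subject has an observed V, this model uses the full sample, unlike the disease model.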

When plot = TRUE, the function also produces a density plot of the test measurements.
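To show how the two fitted models feed into the four point estimates, the sketch below builds the FI, MSI, IPW, and SPE weights in the form given by Alonzo and Pepe (2005) for sensitivity and specificity, and plugs them into a pairwise AUC formula. The plug-in AUC form, the simulated data, and the helper auc_weighted are assumptions made for illustration; this manual does not state how the package computes these quantities.

```r
set.seed(3)
n <- 300
Test <- abs(rnorm(n))
A <- abs(rnorm(n))
D_full <- rbinom(n, 1, pnorm(-1 + 0.8 * Test + 0.5 * A))  # hypothetical truth
V <- rbinom(n, 1, plogis(0.2 + 0.8 * Test))               # MAR verification
D <- ifelse(V == 1, D_full, NA)
dat <- data.frame(Test = Test, A = A, D = D, V = V)

rho <- predict(glm(D ~ Test + A, family = binomial(link = "probit"),
                   data = dat[dat$V == 1, ]),
               newdata = dat, type = "response")
pi_hat <- fitted(glm(V ~ Test + A, family = binomial(link = "logit"), data = dat))

D0 <- ifelse(V == 1, D, 0)   # unverified entries are always multiplied by V = 0

# Weights standing in for the disease indicator (w1) and its complement (w0)
w1 <- list(FI  = rho,
           MSI = V * D0 + (1 - V) * rho,
           IPW = V * D0 / pi_hat,
           SPE = V * D0 / pi_hat - (V - pi_hat) * rho / pi_hat)
w0 <- list(FI  = 1 - rho,
           MSI = V * (1 - D0) + (1 - V) * (1 - rho),
           IPW = V * (1 - D0) / pi_hat,
           SPE = V * (1 - D0) / pi_hat - (V - pi_hat) * (1 - rho) / pi_hat)

# Assumed plug-in AUC: weighted share of pairs (i, j), i != j, with T_i > T_j
auc_weighted <- function(test, a, b) {
  gt <- outer(test, test, ">") + 0.5 * outer(test, test, "==")
  W <- outer(a, b)
  diag(W) <- 0
  sum(gt * W) / sum(W)
}

auc_hat <- mapply(function(a, b) auc_weighted(Test, a, b), w1, w0)
```

FI relies only on the disease model, IPW only on the verification model, and MSI and SPE combine observed disease status with model-based terms.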

Value

A list with elements:

n.total

Total number of subjects.

n.case

Number of verified diseased subjects.

n.control

Number of verified non-diseased subjects.

p.missing

Proportion of missing verification.

pt.est

Point estimates of AUC.

BC.intervals

Bootstrap classic (BC) confidence intervals.

BP.intervals

Bootstrap percentile (BP) confidence intervals.

HEL1.intervals

Hybrid empirical likelihood confidence intervals, type I.

HEL2.intervals

Hybrid empirical likelihood confidence intervals, type II.
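The four descriptive elements follow directly from D. As a rough check on one's own data, they can be reproduced as below; this is a sketch based on the definitions above, using the data from the Examples section, and summary_counts is a hypothetical name.

```r
set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA

summary_counts <- list(
  n.total   = length(D),               # all subjects
  n.case    = sum(D, na.rm = TRUE),    # verified and diseased
  n.control = sum(!D, na.rm = TRUE),   # verified and non-diseased
  p.missing = mean(is.na(D))           # share of unverified subjects
)
```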

References

Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics). doi:10.1111/j.1467-9876.2005.00477.x

Wang, S., Shi, S., and Qin, G. (2026). Empirical likelihood inference for the area under the ROC curve with verification-biased data. Manuscript under peer review.

Examples

set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA
auc.ci.mar(Test, D, A, n.boot = 20, plot = FALSE)
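The example above removes verification completely at random, which is a special case of MAR. A data set in which verification depends on the observed Test and A can be simulated as sketched below; the coefficient values are arbitrary choices made for illustration.

```r
set.seed(42)
n <- 200
Test <- abs(rnorm(n))
A <- abs(rnorm(n))

# Hypothetical disease model used only to generate the true status
D <- rbinom(n, 1, pnorm(-1 + 0.8 * Test + 0.5 * A)) == 1

# MAR: the chance of verification depends on Test and A only,
# not on the disease status itself once Test and A are given
V <- rbinom(n, 1, plogis(-0.5 + 1.2 * Test + 0.3 * A)) == 1

# Unverified subjects get a missing disease status
D[!V] <- NA
```

The resulting Test, D, and A can be passed to auc.ci.mar in the same way as in the example above.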

Confidence Intervals for Sensitivity at Fixed Level of Specificity Under MAR Verification

Description

Computes point estimates and confidence intervals for the sensitivity of a continuous test at a fixed level of specificity when disease verification is missing at random (MAR). The function returns four estimates simultaneously, one for each of the bias-corrected estimators proposed by Alonzo and Pepe (2005): full imputation (FI), mean score imputation (MSI), inverse probability weighting (IPW), and semiparametric efficient (SPE).

Usage

sen.ci.mar(
  Test,
  D,
  A,
  p,
  alpha = 0.05,
  search_step = 0.01,
  tol = 1e-05,
  precision = 1e-04,
  n.boot = 1000,
  plot = TRUE
)

Arguments

Test

Test results; a positive numeric vector.

D

Verified disease status; a logical vector in which NA marks subjects whose disease status was not verified.

A

Covariate; a positive numeric vector. Only one covariate is allowed.

p

Target specificity level; a number between 0 and 1.

alpha

Significance level for the confidence interval. Default is 0.05.

search_step

Step size used in root searching. Default is 0.01.

tol

Tolerance used in root searching. Default is 1e-5.

precision

Precision parameter used in the regression model. Default is 1e-4.

n.boot

Number of bootstrap replicates. Default is 1000.

plot

Logical; if TRUE (default) a density plot is produced.

Details

The function targets sensitivity evaluated at specificity level p (i.e., sensitivity at the threshold achieving specificity p). Agresti–Coull, Wilson score, bootstrap, hybrid empirical likelihood, and influence function-based empirical likelihood confidence intervals are computed and returned in the output list.
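The target quantity can be sketched as a two-step calculation: find the cutoff at which specificity reaches p, then evaluate sensitivity at that cutoff. The helper sens_at_spec below is hypothetical and is demonstrated on complete data with 0/1 weights; the convention that a result above the cutoff counts as positive is an assumption. Replacing the 0/1 weights with bias-corrected weights would give verification-bias-corrected versions.

```r
sens_at_spec <- function(test, w1, w0, p) {
  cuts <- sort(unique(test))
  # Specificity at cutoff c: weighted share of non-diseased with test <= c
  spec <- vapply(cuts, function(c) sum(w0 * (test <= c)) / sum(w0), numeric(1))
  # Smallest cutoff whose specificity is at least p
  c_p <- cuts[which(spec >= p)[1]]
  # Sensitivity at that cutoff: weighted share of diseased with test > c_p
  sens <- sum(w1 * (test > c_p)) / sum(w1)
  c(cutoff = c_p, sensitivity = sens)
}

set.seed(4)
d <- rep(c(1, 0), times = c(50, 50))
test <- rnorm(100, mean = d)
res <- sens_at_spec(test, w1 = d, w0 = 1 - d, p = 0.8)
```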

The disease model \rho is estimated using a probit regression model linear in Test and A based on verified subjects, given by

\rho_i = P(D_i = 1 \mid T_i, A_i) = \Phi(\alpha + \beta T_i + \gamma A_i), \quad i = 1, \ldots, n,

where \Phi denotes the standard normal cumulative distribution function.

The verification model is estimated using a logit regression model linear in Test and A based on all subjects, given by

\operatorname{logit}(\pi_i) = \log\!\left( \frac{\pi_i}{1 - \pi_i} \right) = \alpha + \beta T_i + \gamma A_i, \quad i = 1, \ldots, n,

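where \pi_i = P(V_i = 1 \mid T_i, A_i) and V_i = 1 if the disease status of subject i is verified and 0 otherwise. The coefficients \alpha, \beta, and \gamma here are separate from those of the disease model.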

When plot = TRUE, the function also produces a density plot of the test measurements.

Value

A list with elements:

n.total

Total number of subjects.

n.case

Number of verified diseased subjects.

n.control

Number of verified non-diseased subjects.

p.missing

Proportion of missing verification.

pt.est

Point estimates of sensitivity at specificity p.

pt.est.ac

Point estimates of sensitivity at specificity p using the Agresti–Coull method.

AC.intervals

Agresti–Coull-based confidence intervals.

WS.intervals

Wilson score-based confidence intervals.

BTI.intervals

Bootstrap confidence intervals, type I.

BTII.intervals

Bootstrap confidence intervals, type II.

HEL1.intervals

Hybrid empirical likelihood confidence intervals, type I.

HEL2.intervals

Hybrid empirical likelihood confidence intervals, type II.

IFEL1.intervals

Influence function-based empirical likelihood confidence intervals, type I.

IFEL2.intervals

Influence function-based empirical likelihood confidence intervals, type II.
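For orientation, the textbook Wilson score and Agresti–Coull intervals for a binomial proportion with x successes in n trials are sketched below. How the package adapts them to verification-biased data is not described in this manual, so the sketch covers only the standard complete-data formulas; wilson_ci and agresti_coull are hypothetical helper names.

```r
wilson_ci <- function(x, n, alpha = 0.05) {
  z <- stats::qnorm(1 - alpha / 2)
  p <- x / n
  center <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z / (1 + z^2 / n) * sqrt(p * (1 - p) / n + z^2 / (4 * n^2))
  c(lower = center - half, upper = center + half)
}

agresti_coull <- function(x, n, alpha = 0.05) {
  z <- stats::qnorm(1 - alpha / 2)
  n_adj <- n + z^2                  # adjusted sample size
  p_adj <- (x + z^2 / 2) / n_adj    # adjusted point estimate
  half <- z * sqrt(p_adj * (1 - p_adj) / n_adj)
  c(estimate = p_adj, lower = p_adj - half, upper = p_adj + half)
}

ws <- wilson_ci(40, 50)
ac <- agresti_coull(40, 50)
```

The Agresti–Coull method shifts the point estimate toward 0.5, which is why an adjusted point estimate can differ from the unadjusted one. Both intervals share the same center, and the Agresti–Coull interval is never narrower than the Wilson interval.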

References

Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics). doi:10.1111/j.1467-9876.2005.00477.x

Wang, S., Shi, S., and Qin, G. (2026). Empirical likelihood-based confidence intervals for sensitivity of a continuous test at a fixed level of specificity with verification bias. Manuscript under peer review.

Examples

set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA
sen.ci.mar(Test, D, A, p = 0.8, n.boot = 20, plot = FALSE)


Confidence Intervals for Youden Index Under MAR Verification

Description

Computes point estimates and confidence intervals for the maximum Youden index of a continuous test when disease verification is missing at random (MAR). The function returns four estimates simultaneously, one for each of the bias-corrected estimators proposed by Alonzo and Pepe (2005): full imputation (FI), mean score imputation (MSI), inverse probability weighting (IPW), and semiparametric efficient (SPE).

Usage

yi.ci.mar(
  Test,
  D,
  A,
  alpha = 0.05,
  precision = 1e-04,
  n.boot = 1000,
  plot = TRUE
)

Arguments

Test

Test results; a positive numeric vector.

D

Verified disease status; a logical vector in which NA marks subjects whose disease status was not verified.

A

Covariate; a positive numeric vector. Only one covariate is allowed.

alpha

Significance level for the confidence interval. Default is 0.05.

precision

Precision parameter used in the regression model. Default is 1e-4.

n.boot

Number of bootstrap replicates. Default is 1000.

plot

Logical; if TRUE (default) a density plot is produced.

Details

Wald, bootstrap, and MOVER-based confidence intervals are computed for the maximum Youden index.
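The maximum Youden index is the largest value of sensitivity + specificity - 1 over all cutoffs, and the cutoff attaining it is the optimal cutoff. A minimal complete-data sketch follows; youden_max is a hypothetical helper, and the convention that a result above the cutoff counts as positive is an assumption. Bias-corrected weights could be supplied in place of the 0/1 indicators.

```r
youden_max <- function(test, w1, w0) {
  cuts <- sort(unique(test))
  # Sensitivity and specificity at every candidate cutoff
  sens <- vapply(cuts, function(c) sum(w1 * (test > c)) / sum(w1), numeric(1))
  spec <- vapply(cuts, function(c) sum(w0 * (test <= c)) / sum(w0), numeric(1))
  j <- sens + spec - 1
  k <- which.max(j)   # first cutoff attaining the maximum
  c(youden = j[k], cutoff = cuts[k], sensitivity = sens[k], specificity = spec[k])
}

set.seed(5)
d <- rep(c(1, 0), times = c(50, 50))
test <- rnorm(100, mean = 1.5 * d)
res <- youden_max(test, w1 = d, w0 = 1 - d)
```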

The disease model \rho is estimated using a probit regression model linear in Test and A based on verified subjects, given by

\rho_i = P(D_i = 1 \mid T_i, A_i) = \Phi(\alpha + \beta T_i + \gamma A_i), \quad i = 1, \ldots, n,

where \Phi denotes the standard normal cumulative distribution function.

The verification model is estimated using a logit regression model linear in Test and A based on all subjects, given by

\operatorname{logit}(\pi_i) = \log\!\left( \frac{\pi_i}{1 - \pi_i} \right) = \alpha + \beta T_i + \gamma A_i, \quad i = 1, \ldots, n,

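where \pi_i = P(V_i = 1 \mid T_i, A_i) and V_i = 1 if the disease status of subject i is verified and 0 otherwise. The coefficients \alpha, \beta, and \gamma here are separate from those of the disease model.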

When plot = TRUE, the function also produces a density plot of the test measurements.

Value

A list with elements:

n.total

Total number of subjects.

n.case

Number of verified diseased subjects.

n.control

Number of verified non-diseased subjects.

p.missing

Proportion of missing verification.

pt.est

Point estimates of the maximum Youden index.

pt.est.ac

Point estimates of the maximum Youden index using the Agresti–Coull method.

optimal.cutoff

Optimal cutoff point of test results that maximizes the Youden index.

Wald.intervals

Wald confidence intervals.

BCI.intervals

Bootstrap classic confidence intervals, type I.

BCII.intervals

Bootstrap classic confidence intervals, type II.

BPac.intervals

Bootstrap percentile confidence intervals.

MOVERac.intervals

MOVER confidence intervals using the Agresti–Coull method.

MOVERws.intervals

MOVER confidence intervals using the Wilson score method.
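The general MOVER idea for a sum of two independent parameters can be sketched for the Youden index J = Se + Sp - 1, combining separate Wilson score limits for sensitivity and specificity. This shows the standard MOVER formula at a fixed cutoff with complete data; the package's construction under verification bias is not described in this manual, and wilson_limits and mover_youden are hypothetical helper names.

```r
# Wilson score limits for a proportion p estimated from n observations
wilson_limits <- function(p, n, alpha = 0.05) {
  z <- stats::qnorm(1 - alpha / 2)
  center <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z / (1 + z^2 / n) * sqrt(p * (1 - p) / n + z^2 / (4 * n^2))
  c(center - half, center + half)
}

mover_youden <- function(se, sp, n1, n0, alpha = 0.05) {
  ci_se <- wilson_limits(se, n1, alpha)   # limits for sensitivity
  ci_sp <- wilson_limits(sp, n0, alpha)   # limits for specificity
  j <- se + sp - 1
  # MOVER: recover the variance on each side from the separate limits
  lower <- j - sqrt((se - ci_se[1])^2 + (sp - ci_sp[1])^2)
  upper <- j + sqrt((ci_se[2] - se)^2 + (ci_sp[2] - sp)^2)
  c(estimate = j, lower = lower, upper = upper)
}

res <- mover_youden(se = 0.8, sp = 0.7, n1 = 50, n0 = 60)
```

Because the lower and upper distances are computed separately, the resulting interval need not be symmetric around the point estimate.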

References

Alonzo, T. A. and Pepe, M. S. (2005). Assessing accuracy of a continuous screening test in the presence of verification bias. Journal of the Royal Statistical Society: Series C (Applied Statistics). doi:10.1111/j.1467-9876.2005.00477.x

Wang, S., Shi, S., and Qin, G. (2025). Interval estimation for the Youden index of a continuous diagnostic test with verification biased data. Statistical Methods in Medical Research. doi:10.1177/09622802251322989

Examples

set.seed(123)
Test <- abs(rnorm(100))
A <- abs(rnorm(100))
D <- as.logical(Test + A > stats::quantile(Test + A, 0.8))
D[sample(100, 30)] <- NA
yi.ci.mar(Test, D, A, n.boot = 20, plot = FALSE)