VeRUS: Verification of Reference Intervals Based on the Uncertainty of Sampling

Introduction
Description of VeRUS
Application of VeRUS
- Possible Inputs of the verifyRI Function
- Interpretation of the Results
Quantification of the Similarity of Two Reference Intervals
Advanced Applications
References

Introduction

So far, most methods for the verification of reference intervals require the collection of reference samples that reasonably represent the local population. This is often not feasible due to the high costs and the time-consuming process of collecting these samples. The method VeRUS (Verification of Reference Intervals Based on the Uncertainty of Sampling) is a novel approach to verify reference intervals without the collection of samples. Instead, VeRUS combines the already available data from the laboratory information system with the uncertainty of sampling to verify the reference intervals. A detailed description of the method can be found in [https://doi.org/10.1515/cclm-2025-0728].

Description of VeRUS

At the core of VeRUS is the construction of uncertainty margins around the limits of the reference intervals. The uncertainty margins are defined as the approximated confidence interval of population quantiles at a given sample size. As the uncertainty margins are independent of the sample size of the reference sample, they enable direct comparison of reference intervals derived from different sample sizes. The uncertainty margins can be displayed when printing or plotting the result of findRI. For demonstration purposes, testcase1 is used as an example of real-world data (RWD):

library(refineR)
fit <- findRI(Data = testcase1)
print(fit, uncertaintyRegion = "uncertaintyMargin")

## 
## Estimated Reference Interval
## ------------------------------------------------
## lower limit [ 2.5%]: 10.3 (8.31; 12.3)
## upper limit [97.5%]: 29.9 (27.9; 31.9)
## 
## Width of the uncertainty margin [n = 120]: 90% 
## 
## Model Parameters
## ------------------------------------------------
##      method: refineR (v2.0.0)
##       model: BoxCox
##      N data: 10000
##     rounded: no
##   point est: fullDataEst
## uncertainty: uncertaintyMargin
##      lambda: 1
##          mu: 19.1
##       sigma: 4.99
##       shift: 0
##        cost: -25.6
## NP fraction: 0.803

# The uncertainty margins are shown in the plot
plot(fit, uncertaintyRegion = "uncertaintyMargin")
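
To make the definition of the uncertainty margins more concrete, the following base R sketch computes a textbook large-sample confidence interval for a population quantile, q ± z · sqrt(p(1 − p)/n) / f(q), for a normal distribution. It is an illustration under stated assumptions, not the implementation used by refineR: the helper name quantileMargin is hypothetical, the asymmetry correction is ignored, and the mean and standard deviation (20.1 and 4.99) are read off the rounded model output above, assuming that mu = 19.1 on the Box-Cox scale with lambda = 1 corresponds to 20.1 on the original scale.

```r
# Hypothetical helper: large-sample confidence interval of the p-quantile
# of a normal distribution, estimated from a sample of size n.
quantileMargin <- function(p, mean, sd, n = 120, level = 0.9) {
  q  <- qnorm(p, mean = mean, sd = sd)                           # population quantile
  se <- sqrt(p * (1 - p) / n) / dnorm(q, mean = mean, sd = sd)   # asymptotic standard error
  z  <- qnorm(1 - (1 - level) / 2)                               # two-sided normal quantile
  c(lower = q - z * se, pointEst = q, upper = q + z * se)
}

# Assumed parameters, taken from the rounded model output above
margins <- sapply(c(0.025, 0.975), quantileMargin, mean = 20.1, sd = 4.99)
round(margins, 1)
```

Under these assumptions, the result comes close to the printed margins of 10.3 (8.31; 12.3) and 29.9 (27.9; 31.9) for n = 120 and a width of 90%.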

Application of VeRUS

The standard use case of VeRUS requires only the result of refineR’s findRI function and the candidate reference interval, i.e., the one that is supposed to be verified. Suppose laboratory A wants to verify the reference interval [9.1, 31.5] for a certain biomarker. The verification can be done as follows:

1. Ensuring comparable analytical methods and demographics:
   - Analytical methods: The analytical methods used in laboratory A should be comparable to those used to establish the reference interval.
   - Demographics: The local population of laboratory A should be similar to the population from which the reference interval was derived.
2. Careful filtering of the RWD according to Ammer et al. 2023 [https://doi.org/10.1093/jalm/jfac101].
3. Application of the findRI function to model the reference distribution from the filtered RWD.
4. Using the verifyRI function to verify the reference interval.

The most relevant parameters of the verifyRI function are:

RIdata: (RWDRI) object (result of the findRI function) or a (numeric) vector specifying a reference interval.
RIcand: (RWDRI) object or a (numeric) vector specifying the reference interval that needs to be verified.
RIperc: (numeric) vector specifying the percentiles which define the reference interval (default: c(0.025, 0.975)).
printResults: (logical) indicating whether to print results to the console.
generatePlot: (logical) indicating whether to generate a verification plot.

# Suppose the currently used reference interval is [9.1, 31.5]
verifyRI(RIdata = fit, RIcand = c(9.1, 31.5))

## 
##  Verification of Reference Interval [VeRUS] 
## ------------------------------------------ 
## 
## Verification result: TEST PASSED (all point estimates within uncertainty margins) 
## 
## Perc  | RI Data           | RI Cand           | Overlap         
## ----- | ----------------- | ----------------- | --------------- 
##  2.5% | 10.3 (8.31; 12.3) |  9.1 (6.81; 11.4) | Overlap with PE
## 97.5% | 29.9 (27.9; 31.9) | 31.5 (29.2; 33.8) | Overlap with PE
## 
## 
## Fraction of data within/outside of reference interval:
## Interval | Reference Limits | Below | Within | Above 
## -------- | ---------------- | ----- | ------ | ----- 
##  RI Data |    [10.3 , 29.9] | 12.2% |  76.0% | 11.8%
##  RI Cand |    [ 9.1 , 31.5] | 10.9% |  78.7% | 10.4%

As can be seen, the reference interval [9.1, 31.5] is verified by VeRUS. The first table shows the uncertainty margins around the point estimates of each reference limit. The second table shows the fraction of samples in the Data argument of findRI that fall below, within, or above the respective reference interval.
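
The fractions in the second table are simple proportions of the input data. The following base R sketch shows the idea on simulated data; the data set and the helper name fractionTable are assumptions made for illustration, so the numbers will differ from those obtained with testcase1.

```r
set.seed(1)
# Simulated stand-in for routine data (assumption):
# 80% non-pathological and 20% pathological results
x <- c(rnorm(8000, mean = 20, sd = 5), rnorm(2000, mean = 35, sd = 8))

# Hypothetical helper: share of results below, within, and above an interval
fractionTable <- function(x, limits) {
  c(below  = mean(x < limits[1]),
    within = mean(x >= limits[1] & x <= limits[2]),
    above  = mean(x > limits[2]))
}

# Percentages for the estimated and the candidate reference interval
round(100 * rbind(RIData = fractionTable(x, c(10.3, 29.9)),
                  RICand = fractionTable(x, c(9.1, 31.5))), 1)
```

Because the candidate interval is wider than the estimated one, a larger share of the data falls within it.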

The verifyRI function is flexible. If only the 99th percentile is relevant, one could verify it as follows:

verifyRI(RIdata = fit, RIcand = 67, RIperc = 0.99)

## 
##  Verification of Reference Interval [VeRUS] 
## ------------------------------------------ 
## 
## Verification result: TEST NOT PASSED 
## 
## Perc  | RI Data           | RI Cand        | Overlap    
## ----- | ----------------- | -------------- | ---------- 
## 99.0% | 31.7 (28.9; 34.5) |  67 ( 60;  74) | No overlap
## 
## 
## Fraction of data within/outside of reference interval:
## Interval | Reference Limits | Within | Above  
## -------- | ---------------- | ------ | ------ 
##  RI Data |      [   , 31.7] |  89.8% |  10.2%
##  RI Cand |      [   ,   67] | 100.0% |   0.0%

However, if more than one reference limit of a numeric RIcand is available, it is recommended to provide at least two of them as a vector (e.g., c(lower, upper)) and their corresponding percentiles as RIperc. If RIdata is an RWDRI object and RIcand is a numeric vector, providing only a single value necessitates an imputation step, which may lead to unexpected results.

Possible Inputs of the verifyRI Function

The verifyRI function can be used to compare:

an RWDRI object (result of the findRI function) with a numeric reference interval
two numeric reference intervals
two RWDRI objects

If the parameters describing the reference distribution from which the candidate RI is derived are known, a custom RWDRI object can be defined and provided to the verifyRI function to improve the accuracy of the verification.

# compare RWDRI object with a numeric reference interval
verifyRI(RIdata = fit, RIcand = c(9.1, 33.5),
 title = "Comparison of RWDRI with Numeric RI",
 printResults = FALSE)
# compare two numeric RIs
verifyRI(RIdata = c(4, 26), RIcand = c(1, 29),
 title = "Comparison of two Numeric RIs",
 printResults = FALSE)

# compare two RWDRI objects
# custom RWDRI object describing the reference distribution
# Mu and Sigma define the mean and standard deviation of a normal distribution.
# Then the inverse Box-Cox transformation is applied to the data with the power parameter Lambda.
# The Shift parameter is used to shift the distribution to the desired location.
custom_RWDRI <- list(Mu = 20, Sigma = 5, Lambda = 0.9, Shift = 0)
class(custom_RWDRI) <- "RWDRI"
verifyRI(RIdata = fit, RIcand = custom_RWDRI,
 title = "Comparison of two RWDRI objects",
 printResults = FALSE)
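
The reference limits implied by such a custom parameter set can be checked by hand. The following base R sketch assumes the common Box-Cox parameterization y = (x^lambda − 1) / lambda and ignores any truncation of the normal distribution; the helper name invBoxCox is hypothetical and not part of refineR.

```r
# Hypothetical helper: inverse Box-Cox transformation, assuming the
# parameterization y = (x^lambda - 1) / lambda (log transform for lambda = 0)
invBoxCox <- function(y, lambda) {
  if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
}

# Same parameters as in custom_RWDRI
pars <- list(Mu = 20, Sigma = 5, Lambda = 0.9, Shift = 0)

# 2.5% and 97.5% quantiles on the transformed (normal) scale
yLimits <- qnorm(c(0.025, 0.975), mean = pars$Mu, sd = pars$Sigma)

# Back-transform and shift to the original measurement scale
riLimits <- invBoxCox(yLimits, pars$Lambda) + pars$Shift
signif(riLimits, 3)
```

Under these assumptions, the limits are about 13.2 and 40.3, which agrees with the point estimates that verifyRI reports for custom_RWDRI in the section on non-default parameters.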

Interpretation of the Results

The verification of each reference limit may result in one of three outcomes:

The point estimate (PE) of each reference limit is within the corresponding uncertainty margin of the other reference interval. Here the greatest level of confidence in the verification is given. This is indicated by “Overlap with PE” in the first table and by the green color of the overlapping uncertainty margin.
The uncertainty margins of the candidate RI and the local RI overlap but at least one point estimate is outside the uncertainty margin of the other interval. This is indicated by “Overlap of margins” in the first table and by the yellow color of the overlapping uncertainty margin.
The uncertainty margins of the candidate RI and the local RI do not overlap. This is indicated by “No overlap” in the first table and by the red color of the non-overlapping uncertainty margin.
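
The three outcomes can be expressed as a small decision rule. The following base R sketch is an illustration of the description above, not the code used inside verifyRI; the helper name classifyOverlap is hypothetical, and the input values are taken from the verifyRI outputs shown in this document.

```r
# Hypothetical helper: classify one reference limit from the two point
# estimates and their uncertainty margins, each given as c(low, high)
classifyOverlap <- function(peData, umData, peCand, umCand) {
  inside <- function(x, um) x >= um[1] && x <= um[2]
  if (inside(peData, umCand) && inside(peCand, umData)) {
    "Overlap with PE"      # both point estimates lie within the other margin
  } else if (umData[1] <= umCand[2] && umCand[1] <= umData[2]) {
    "Overlap of margins"   # margins overlap, at least one point estimate outside
  } else {
    "No overlap"           # margins are disjoint
  }
}

resultPE      <- classifyOverlap(10.3, c(8.31, 12.3), 9.1, c(6.81, 11.4))
resultMargins <- classifyOverlap(13.2, c(10.6, 15.8), 9.1, c(7, 11.3))
resultNone    <- classifyOverlap(31.7, c(28.9, 34.5), 67, c(60, 74))
c(resultPE, resultMargins, resultNone)
```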

Quantification of the Similarity of Two Reference Intervals

VeRUS can be used not only to verify reference intervals but also to determine the similarity of two reference intervals. The function getRISimilarity can be used for this.

# Suppose the currently used reference interval is [9.1, 55]
getRISimilarity(RIdata = fit, RIcand = c(9.1, 55))

## 
## Similarity Table
## ---------------- 
## 
## Perc  | RI Data | RI Cand | Max Sample Size* | S-Value 
## ----- | ------- | ------- | ---------------- | ------- 
##  2.5% |    10.3 |     9.1 |             3650 | 0.00164
## 97.5% |    29.9 |      55 |           n < 25 |  > 0.24
## 
## 
## *Higher sample sizes indicate greater similarity between the reference limits.

RI similarity is quantified using uncertainty margins (UMs) around reference limits (RLs). UMs, which approximate quantile confidence intervals, narrow with larger sample sizes. The similarity of two RIs is quantified by the largest sample size, i.e., the narrowest UMs, for which the UMs of all corresponding RLs still overlap.

As sample sizes are not a particularly intuitive measure of similarity, the similarity is also expressed as an s-value. The s-value presents the level of similarity analogously to a p-value: s ≤ 0.05 corresponds to overlapping CIs of RIs established with the standard non-parametric method with N = 120 (since 6 / 120 = 0.05), which is the minimal sample size for which the RIs are considered equivalent. Smaller s-values indicate a higher similarity of the RIs and therefore a higher certainty of their equivalence.

The s-value is calculated as s = 6 / Nmin, where Nmin is the maximum sample size for which the UMs of all corresponding RLs overlap.
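
As a quick arithmetic check of the formula s = 6 / Nmin, the following base R lines convert sample sizes into s-values. The helper name sValue is hypothetical; 120 is the default sample size, and 463 and 3650 are maximum sample sizes reported in the similarity tables of this document.

```r
# s-value as defined above: s = 6 / Nmin (hypothetical helper name)
sValue <- function(nMin) 6 / nMin

sValues <- sValue(c(120, 463, 3650))
signif(sValues, 3)
```

The three values are 0.05, about 0.013, and about 0.00164, in line with the threshold and the S-Value column of the similarity tables.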

Advanced Applications

Invisible Output of the verifyRI Function

The verifyRI function returns an invisible object. It is a named list with the following elements:

testPassedPointEst (logical) indicating whether all point estimates are within the corresponding uncertainty margins, i.e., the highest level of confidence in the verification
testPassedMargins (logical) indicating whether all uncertainty margins overlap
RIVerificationTab (data.frame) summarizing the verification results

# We deactivate the printing of the results and plotting
verification_result <- verifyRI(RIdata = fit, RIcand = c(9.1, 33), printResults = FALSE, generatePlot = FALSE)
verification_tab <- verification_result$RIVerificationTab
# The verification table also contains the values of the power parameter (Lambda) of each estimated model.
# The Lambda values of the models of the candidate distribution and the local distribution are the same when an RWDRI object is compared to a numeric candidate RI.
print(paste0("All Lambdas are equal: ", identical(verification_tab$RICandLambda, verification_tab$RIDataLambda)))

## [1] "All Lambdas are equal: TRUE"

# For this example, the Lambda values are irrelevant, as they are the same.
cols_of_interest <- colnames(verification_tab)
cols_of_interest <- cols_of_interest[!cols_of_interest %in% c("RICandLambda", "RIDataLambda")]

knitr::kable(verification_tab[, cols_of_interest])

Percentile | RICandPointEst | RICandMarginLow | RICandMarginHigh | RIDataPointEst | RIDataMarginLow | RIDataMarginHigh | OverlapPointEst | OverlapMargins
---------- | -------------- | --------------- | ---------------- | -------------- | --------------- | ---------------- | --------------- | --------------
     0.025 |            9.1 |        6.654426 |         11.54557 |       10.31456 |         8.31167 |         12.31745 |            TRUE |           TRUE
     0.975 |           33.0 |       30.554426 |         35.44557 |       29.88833 |        27.88544 |         31.89122 |           FALSE |           TRUE

Non-Default Parameters to Define Uncertainty Margins

By default, the uncertainty margins approximate the 90% confidence intervals expected when estimating the population quantiles RIperc with a sample size of 120, which is the minimum sample size recommended for the non-parametric direct approach of establishing RIs. The width of the uncertainty margins can be adjusted with the parameters:

n (integer) indicating the sample size for which the sampling uncertainty shall be taken into account. Default is 120.
UMprop (numeric) defining the desired confidence level for the uncertainty margins. Default is 0.9 (90% approximated confidence interval).

verifyRI(RIdata = custom_RWDRI, RIcand = c(9.1, 31.5), title = "n=120, UMprop=0.9")

## 
##  Verification of Reference Interval [VeRUS] 
## ------------------------------------------ 
## 
## Verification result: TEST NOT PASSED 
## 
## Perc  | RI Data           | RI Cand           | Overlap            
## ----- | ----------------- | ----------------- | ------------------ 
##  2.5% | 13.2 (10.6; 15.8) |  9.1 (   7; 11.3) | Overlap of margins
## 97.5% | 40.3 (37.4; 43.2) | 31.5 (29.1; 33.9) |         No overlap

# Increasing n narrows the uncertainty margins and may be desired for a stricter verification
verifyRI(RIdata = custom_RWDRI, RIcand = c(9.1, 31.5), n = 1000, UMprop = 0.95, title = "n=1000, UMprop=0.95")

## 
##  Verification of Reference Interval [VeRUS] 
## ------------------------------------------ 
## 
## Verification result: TEST NOT PASSED 
## 
## Perc  | RI Data           | RI Cand           | Overlap    
## ----- | ----------------- | ----------------- | ---------- 
##  2.5% | 13.2 (12.1; 14.3) |  9.1 (8.23; 9.98) | No overlap
## 97.5% | 40.3 (39.1; 41.5) | 31.5 (30.5; 32.5) | No overlap

By default, an asymmetry correction is applied to the uncertainty margins. This correction may be deactivated by setting the parameter asymmetryCorr to FALSE. This is only relevant for highly skewed distributions.

The n, UMprop, and asymmetryCorr parameters can also be set in the getRI, print, and plot functions.

plot(fit, uncertaintyRegion = "uncertaintyMargin", n = 1000, UMprop = 0.95, asymmetryCorr = TRUE)

getRI(fit, n = 1000, UMprop = 0.95, asymmetryCorr = TRUE)

##   Percentile PointEst CILow CIHigh     UMLow   UMHigh
## 1      0.025 10.31456    NA     NA  9.487822 11.14130
## 2      0.975 29.88833    NA     NA 29.061590 30.71507

print(fit, uncertaintyRegion = "uncertaintyMargin", n = 1000, UMprop = 0.95, asymmetryCorr = TRUE)

## 
## Estimated Reference Interval
## ------------------------------------------------
## lower limit [ 2.5%]: 10.3 (9.49; 11.1)
## upper limit [97.5%]: 29.9 (29.1; 30.7)
## 
## Width of the uncertainty margin [n = 1000]: 95% 
## 
## Model Parameters
## ------------------------------------------------
##      method: refineR (v2.0.0)
##       model: BoxCox
##      N data: 10000
##     rounded: no
##   point est: fullDataEst
## uncertainty: uncertaintyMargin
##      lambda: 1
##          mu: 19.1
##       sigma: 4.99
##       shift: 0
##        cost: -25.6
## NP fraction: 0.803
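
The effect of n and UMprop on the margin width can be checked against the numbers printed in this document. The following base R sketch assumes that the half-width of a margin scales with z / sqrt(n), as it does for a large-sample confidence interval of a quantile; this is an assumption for illustration, not a statement about the internal implementation, and the helper name zOverSqrtN is hypothetical.

```r
# Half-widths of the lower-limit margin, taken from outputs in this document
hwDefault <- (12.31745 - 8.31167) / 2     # n = 120,  UMprop = 0.90
hwStrict  <- (11.14130 - 9.487822) / 2    # n = 1000, UMprop = 0.95

# Hypothetical helper: scaling factor of the half-width under the
# assumption that it is proportional to z / sqrt(n)
zOverSqrtN <- function(n, level) qnorm(1 - (1 - level) / 2) / sqrt(n)

ratios <- c(observed = hwDefault / hwStrict,
            expected = zOverSqrtN(120, 0.90) / zOverSqrtN(1000, 0.95))
round(ratios, 3)
```

Both ratios are close to 2.42, i.e., raising n from 120 to 1000 narrows the margins far more than raising UMprop from 0.9 to 0.95 widens them.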

Customization of the Plot

The plot of verifyRI can be customized by setting the parameters:

xlab (character) The label of the x-axis.
title (character) The title of the plot.
Scale (character) The scale of the x-axis. Possible values are “original”, “splitXAxis”, and “transformed”. Default is “original”.
candLabel (character) The label for the candidate reference interval.
dataLabel (character) The label for the local reference interval.

scaleOptions <- c("original", "splitXAxis", "transformed")
# Labels for the candidate and the local RI; NULL keeps the default labels
candLabels <- list(NULL, "Lab A", "Timepoint A")
dataLabels <- list(NULL, "Lab B", "Timepoint B")

# One verification plot per x-axis scale, each with a different set of labels
for (i in seq_along(scaleOptions)) {
  verifyRI(
    RIdata = fit, RIcand = c(9.1, 33), printResults = FALSE,
    xlab = paste("x-axis scale:", scaleOptions[i]),
    title = paste("Verification plot with", scaleOptions[i], "scale"),
    Scale = scaleOptions[i],
    candLabel = candLabels[[i]],
    dataLabel = dataLabels[[i]]
  )
}

Advanced Parameters for getRISimilarity

The getRISimilarity function can be customized with the following parameters:

UMprop (numeric) The desired confidence level for the uncertainty margins. Default is 0.9 (90% confidence interval).
asymmetryCorr (logical) If TRUE, an asymmetry correction is applied to the uncertainty margins. Default is TRUE.
Overlap (character) either “OverlapMargins” (uncertainty margins overlap) or “OverlapPointEst” (point estimates are within uncertainty margins). Default is “OverlapMargins”.

# Suppose the currently used reference interval is [9.1, 55]
getRISimilarity(RIdata = fit, RIcand = c(9.1, 55), UMprop = 0.95, asymmetryCorr = FALSE, Overlap = "OverlapPointEst")

## 
## Similarity Table
## ---------------- 
## 
## Perc  | RI Data | RI Cand | Max Sample Size* | S-Value 
## ----- | ------- | ------- | ---------------- | ------- 
##  2.5% |    10.3 |     9.1 |              463 |   0.013
## 97.5% |    29.9 |      55 |           n < 25 |  > 0.24
## 
## 
## *Higher sample sizes indicate greater similarity between the reference limits.

Alternative Verification Method

Haeckel et al. proposed using Equivalence Limits [https://doi.org/10.1515/labmed-2016-0002] to verify reference intervals. Equivalence Limits can be applied by setting the parameter marginType to "EL".

verifyRI(
  RIdata = fit, RIcand = c(9.1, 33), printResults = TRUE,
  marginType = "EL",
  title = "Verification with Equivalence Limits"
)

## 
##  Verification of Reference Interval [EL] 
## --------------------------------------- 
## 
## Verification result: TEST PASSED (all uncertainty margins overlap) 
## 
## Perc  | RI Data           | RI Cand          | Overlap            
## ----- | ----------------- | ---------------- | ------------------ 
##  2.5% | 10.3 (9.53; 11.1) | 9.1 ( 8.3;  9.9) | Overlap of margins
## 97.5% | 29.9 (  28; 31.7) |  33 (30.8; 35.2) | Overlap of margins
## 
## 
## Fraction of data within/outside of reference interval:
## Interval | Reference Limits | Below | Within | Above 
## -------- | ---------------- | ----- | ------ | ----- 
##  RI Data |    [10.3 , 29.9] | 12.2% |  76.0% | 11.8%
##  RI Cand |    [ 9.1 ,   33] | 10.9% |  81.4% |  7.7%

References

Beck, M., Dufey, F., Ammer, T., Schützenmeister, A., Zierk, J., Rank, C.M., Rauh, M. VeRUS: verification of reference intervals based on the uncertainty of sampling. Clinical Chemistry and Laboratory Medicine (2025). https://doi.org/10.1515/cclm-2025-0728.

Ammer, T., Schuetzenmeister, A., Rank, C.M., Doyle, K. Estimation of Reference Intervals from Routine Data Using the refineR Algorithm — A Practical Guide. The Journal of Applied Laboratory Medicine, 8(1):84-91 (2023). https://doi.org/10.1093/jalm/jfac101.

Haeckel, R., Wosniok, W., Arzideh, F. Equivalence limits of reference intervals for partitioning of population data. Relevant differences of reference limits. LaboratoriumsMedizin, 40(3):199-205 (2016). https://doi.org/10.1515/labmed-2016-0002.