The main purpose of this package is to create descriptive tables for various subgroups quickly and easily. Most of the statistics can also be calculated using weights. Below are some examples of the package's functionality using artificial data.
With the predefined cell function iqr_cell, one can generate a simple table with the median and interquartile range of a variable. It calculates the median, Q1 and Q3 of the bmi variable in the data.frame d. The variable to be analysed is selected by setting x_vars='bmi'. With rows='sex', the factor that splits the table into rows is selected. The parameter rnames='Sex' sets the label for the row groups.
tab <- tabular.ade(x_vars='bmi', rows='sex', rnames='Sex', data=d, FUN=iqr_cell)
knitr::kable(tab, caption='Median (Q1/Q3) of BMI')
| 1 | 2 |
|---|---|
| Sex | |
| Men | 31.2 (27.6/35.2) |
| Women | 31.4 (27.8/35.2) |
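As a rough illustration of what such a cell contains, the same three numbers can be computed with base R's quantile(). This is only a sketch on made-up values: the helper name median_iqr and the toy vector are assumptions for illustration and are not part of the package, and iqr_cell itself may round or format its output differently.

```r
# Hypothetical helper (not part of the package): formats "median (Q1/Q3)"
median_iqr <- function(v, digits = 1) {
  # Quartiles with R's default quantile type; missing values are dropped
  q <- quantile(v, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
  q <- formatC(q, format = "f", digits = digits)
  paste0(q[2], " (", q[1], "/", q[3], ")")
}

bmi_toy <- c(22.5, 27.0, 31.0, 35.0, 40.5, NA)  # made-up values
median_iqr(bmi_toy)
```

Applying such a helper separately to each subgroup gives the kind of cell content shown in the table above.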
For a simple 2 x 2 table, a second factor that splits the table into columns must be specified, which is done with cols='ethnic' and cnames='Ethnicity'.
tab <- tabular.ade(x_vars='bmi', rows='sex', rnames='Sex', cols='ethnic', cnames='Ethnicity', data=d, FUN=iqr_cell)
knitr::kable(tab, caption='Median (Q1/Q3) of BMI')
| 1 | 2 | 3 | 4 |
|---|---|---|---|
| Sex | | | |
| | Ethnicity | Other | Caucasian |
| Men | | 31.2 (27.6/35.2) | 31.3 (27.6/35.2) |
| Women | | 31.5 (27.8/35.2) | 31.3 (27.9/35.2) |
More than one factor can be used for rows or columns at once to create nested tables, for example with rows=c('sex', 'dec'), rnames=c('Sex', 'Decades').
tab <- tabular.ade(x_vars='bmi', rows=c('sex', 'dec'), rnames=c('Sex', 'Decades'), cols='ethnic', cnames='Ethnicity', data=d, FUN=iqr_cell)
knitr::kable(tab, caption='Median (Q1/Q3) of BMI')
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| Sex | Decades | | | |
| | | Ethnicity | Other | Caucasian |
| Men | (20,30] | | 31.5 (28.1/35.5) | 30.9 (27.5/35.1) |
| | (30,40] | | 31.4 (27.4/35.3) | 31.3 (27.5/34.9) |
| | (40,50] | | 31.1 (27.7/35.4) | 31.1 (27.9/35.1) |
| | (50,60] | | 31.9 (27.7/35.2) | 31.3 (27.7/34.9) |
| | (60,70] | | 31.2 (27.6/35.1) | 31.4 (27.5/35.7) |
| | (70,80] | | 30.1 (26.7/34.4) | 31.4 (27.7/35.2) |
| Women | (20,30] | | 30.9 (26.8/35.8) | 31.1 (27.6/35.8) |
| | (30,40] | | 31.6 (27.5/34.9) | 31.1 (27.7/34.8) |
| | (40,50] | | 31.9 (27.8/36.0) | 31.6 (28.0/35.5) |
| | (50,60] | | 31.0 (27.9/34.7) | 31.3 (28.2/35.5) |
| | (60,70] | | 31.8 (28.2/35.0) | 31.4 (27.8/34.9) |
| | (70,80] | | 31.7 (28.8/35.6) | 31.2 (27.9/34.6) |
The cell function n_cell returns the number of non-missing observations in each cell. Missing values of the x_vars variable are excluded.
tab <- tabular.ade(x_vars='sex', rows=c('dec','bmi_q'), rnames=c('Decades','BMI Quantiles'), cols=c('sex', 'ethnic'), cnames=c('Sex', 'Ethnicity'), data=d, FUN=n_cell)
knitr::kable(tab, caption='N of Obs.')
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|
| Decades | BMI Quantiles | | | | | |
| | | Sex | Men | | Women | |
| | | Ethnicity | Other | Caucasian | Other | Caucasian |
| (20,30] | (14.8,27.7] | | 47 | 173 | 64 | 155 |
| | (27.7,31.3] | | 64 | 167 | 45 | 150 |
| | (31.3,35.2] | | 57 | 150 | 42 | 125 |
| | (35.2,65.2] | | 58 | 162 | 53 | 164 |
| (30,40] | (14.8,27.7] | | 50 | 172 | 57 | 157 |
| | (27.7,31.3] | | 37 | 144 | 48 | 169 |
| | (31.3,35.2] | | 48 | 164 | 67 | 161 |
| | (35.2,65.2] | | 46 | 149 | 50 | 146 |
| (40,50] | (14.8,27.7] | | 63 | 156 | 53 | 145 |
| | (27.7,31.3] | | 65 | 180 | 42 | 158 |
| | (31.3,35.2] | | 55 | 153 | 56 | 157 |
| | (35.2,65.2] | | 64 | 161 | 63 | 173 |
| (50,60] | (14.8,27.7] | | 55 | 137 | 47 | 144 |
| | (27.7,31.3] | | 48 | 139 | 55 | 171 |
| | (31.3,35.2] | | 63 | 142 | 57 | 152 |
| | (35.2,65.2] | | 56 | 131 | 44 | 162 |
| (60,70] | (14.8,27.7] | | 49 | 167 | 47 | 151 |
| | (27.7,31.3] | | 52 | 135 | 55 | 160 |
| | (31.3,35.2] | | 46 | 145 | 58 | 177 |
| | (35.2,65.2] | | 48 | 168 | 50 | 148 |
| (70,80] | (14.8,27.7] | | 60 | 150 | 39 | 141 |
| | (27.7,31.3] | | 50 | 140 | 54 | 160 |
| | (31.3,35.2] | | 33 | 154 | 57 | 163 |
| | (35.2,65.2] | | 44 | 154 | 56 | 132 |
With the cell function quantile_cell, quantiles can be calculated. The parameter probs defines which quantile is calculated.
tab <- tabular.ade(x_vars='bmi', xname='BMI', rows=c('sex','ethnic','disease','treat'), rnames=c('Sex', 'Ethnicity', 'Disease', 'Treatment'), data=d, FUN=quantile_cell, probs=0.95)
knitr::kable(tab, caption='95th quantile of BMI')
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| Sex | Ethnicity | Disease | Treatment | |
| Men | Other | no | no | 42.0 |
| | | | yes | 41.9 |
| | | yes | no | 43.0 |
| | | | yes | 44.8 |
| | Caucasian | no | no | 41.4 |
| | | | yes | 42.8 |
| | | yes | no | 43.2 |
| | | | yes | 42.0 |
| Women | Other | no | no | 41.3 |
| | | | yes | 41.7 |
| | | yes | no | 42.4 |
| | | | yes | 39.0 |
| | Caucasian | no | no | 41.8 |
| | | | yes | 41.2 |
| | | yes | no | 40.4 |
| | | | yes | 43.7 |
There are several predefined cell functions in this package. See the help pages for more information. The stat_cell function includes a wide range of statistics and is the most useful cell function of all.
basic parameters, digits=3, digits2=1
basic parameters, digits=3, style=1
basic parameters, digits=3, style=1, nsd=1
basic parameters, digits=3, add_n=FALSE
basic parameters, digits=3, probs=0.5, plabels=FALSE
basic parameters, digits=1, digits2=0, event=2, type=1
basic parameters, digits=0, pct=FALSE, prefix="", suffix=""
basic parameters, digits=3
basic parameters, digits=3
Basic parameters are x, y, z, w, cell_ids, row_ids, col_ids, vnames, vars, n_min. Each cell function must accept these parameters; they are passed automatically by the tabular.ade function. Most of the functions use only the x variable for calculations and w for weighted calculations. Only corr_p_cell uses the y variable. Additional parameters such as digits = 3 can be passed to tabular.ade( , ...) in place of the dots.
It is also possible to write a custom cell function. This allows any cell design and much more.
my_cell <- function(x, y, z, w, cell_ids, row_ids, col_ids, vnames, vars, n_min)
{
  # Mean of x over the observations in the current cell
  out <- format(mean(x[cell_ids], na.rm=TRUE), digits = 3)
  return(out)
}
tab <- tabular.ade(x_vars='age', rows='sex', rnames='Sex', cols='dec', cnames='Decades', data=d, FUN=my_cell)
knitr::kable(tab, caption='Mean Age')
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Sex | | | | | | | |
| | Decades | (20,30] | (30,40] | (40,50] | (50,60] | (60,70] | (70,80] |
| Men | | 25.4 | 35.6 | 45.5 | 55.6 | 65.7 | 75.5 |
| Women | | 25.4 | 35.4 | 45.4 | 55.2 | 65.5 | 75.2 |
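A custom cell function can be tried out by hand before it is passed to tabular.ade. The sketch below defines a weighted-mean cell with the required basic parameters and calls it directly on toy vectors. It assumes that cell_ids indexes the observations belonging to the current cell and that w holds one weight per observation (or NULL without weights). The arguments this cell does not use are simply set to NULL here, and the name wmean_cell is hypothetical.

```r
# Hypothetical custom cell function: weighted mean of x within one cell
wmean_cell <- function(x, y, z, w, cell_ids, row_ids, col_ids, vnames, vars, n_min,
                       digits = 3) {
  xs <- x[cell_ids]
  # Assumption: without weights, every observation counts equally
  ws <- if (is.null(w)) rep(1, length(xs)) else w[cell_ids]
  format(weighted.mean(xs, ws, na.rm = TRUE), digits = digits)
}

x_toy <- c(10, 20, 30, 40)  # made-up values
w_toy <- c(1, 1, 2, 0)      # made-up weights

# Manual call for a cell that contains the first three observations
res <- wmean_cell(x = x_toy, y = NULL, z = NULL, w = w_toy, cell_ids = c(1, 2, 3),
                  row_ids = NULL, col_ids = NULL, vnames = NULL, vars = NULL, n_min = 0)
res
```

An extra argument such as digits would presumably be supplied through the dots of tabular.ade, in the same way as for the predefined cell functions.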
my_cell <- function(x, y, z, w, cell_ids, row_ids, col_ids, vnames, vars, n_min)
{
  # Builds a string such as "Men: 444, Women: 408" from the level counts
  out <- NULL
  tab <- table(x[cell_ids])
  for(i in 1:length(tab)){
    out <- paste(out, levels(x)[i], ': ', tab[i], sep='')
    if(i<length(tab)) out <- paste(out, ', ', sep='')
  }
  return(out)
}
tab <- tabular.ade(x_vars='sex', rows='dec', rnames='Decades', cols='stage', cnames='Stage', data=d, FUN=my_cell)
knitr::kable(tab, caption='Frequencies')
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| Decades | | | | |
| | Stage | 1 | 2 | 3 |
| (20,30] | | Men: 444, Women: 408 | Men: 341, Women: 307 | Men: 93, Women: 83 |
| (30,40] | | Men: 404, Women: 418 | Men: 341, Women: 357 | Men: 65, Women: 81 |
| (40,50] | | Men: 441, Women: 432 | Men: 373, Women: 331 | Men: 83, Women: 84 |
| (50,60] | | Men: 369, Women: 427 | Men: 330, Women: 330 | Men: 72, Women: 75 |
| (60,70] | | Men: 400, Women: 426 | Men: 323, Women: 339 | Men: 87, Women: 81 |
| (70,80] | | Men: 391, Women: 365 | Men: 321, Women: 347 | Men: 73, Women: 90 |
b_cell <- function(x, y, z, w, cell_ids, row_ids, col_ids, vnames, vars, n_min)
{
  out <- NULL
  # Two distinct values: percentage and count of the second level
  if(length(unique(x))==2){
    lv <- levels(x)
    n <- sum(x[cell_ids]==lv[2])
    N <- sum(table(x[cell_ids]))
    out <- paste(levels(x)[2], ': ', format((n/N)*100, digits=3), '% (N:', n, ')', sep='')
  }
  # Non-factor variable with more than two values: quartiles, printed as Q1 (median/Q3)
  if(!is.factor(x) & length(unique(x))> 2){
    quant <- format(quantile(x[cell_ids], c(0.25, 0.5, 0.75), na.rm=TRUE), digits=3)
    out <- paste(quant[1], ' (', quant[2], '/', quant[3], ')', sep='')
  }
  # Factor with more than two levels: percentage of each level
  if(is.factor(x) & length(unique(x))> 2){
    lv <- levels(x)
    n <- table(x[cell_ids])
    N <- sum(table(x[cell_ids]))
    out <- paste(lv, ': ', format((n/N)*100, digits=3), '%', collapse=' | ', sep='')
  }
  return(out)
}
tab <- tabular.ade(x_vars=c('bmi','ethnic','stage'), xname=c('BMI','Ethnicity','Stages'), cols='sex', cnames='Sex', data=d, FUN=b_cell)
knitr::kable(tab, caption='Diverse variables')
| 1 | 2 | 3 | 4 |
|---|---|---|---|
| | Sex | Men | Women |
| BMI | | 27.6 (31.2/35.2) | 27.8 (31.4/35.2) |
| Ethnicity | | Caucasian: 74.5% (N:3713) | Caucasian: 74.8% (N:3754) |
| Stages | | 1: 49.41% \| 2: 41.04% \| 3: 9.56% | 1: 49.67% \| 2: 40.45% \| 3: 9.88% |
t_test_cell <- function(x, y, z, w, cell_ids, row_ids, col_ids, vnames, vars, n_min)
{
  # x is the measured variable, y the two-level grouping factor
  v <- x[cell_ids]
  group <- y[cell_ids]
  test <- t.test(v[which(group==levels(group)[1])], v[which(group==levels(group)[2])])
  mdiff <- format(diff(test$estimate), digits=3)
  p <- base::format.pval(test$p.value, digits=2, eps=0.0001)
  out <- paste('Diff: ', mdiff, ', p-value: ', p, sep='')
  return(out)
}
tab <- tabular.ade(x_vars='bmi', xname='BMI', y_vars='ethnic', yname='Ethnicity', rows='dec', rnames='Decades', cols='sex', cnames='Sex', data=d, FUN=t_test_cell)
knitr::kable(tab, caption='T-test for BMI between Ethnicity groups')
| 1 | 2 | 3 | 4 |
|---|---|---|---|
| Decades | | | |
| | Sex | Men | Women |
| (20,30] | | Diff: -0.483, p-value: 0.24 | Diff: 0.302, p-value: 0.53 |
| (30,40] | | Diff: -0.194, p-value: 0.71 | Diff: -0.305, p-value: 0.47 |
| (40,50] | | Diff: 0.171, p-value: 0.7 | Diff: -0.0821, p-value: 0.86 |
| (50,60] | | Diff: -0.442, p-value: 0.33 | Diff: 0.185, p-value: 0.69 |
| (60,70] | | Diff: -0.15, p-value: 0.76 | Diff: -0.0931, p-value: 0.83 |
| (70,80] | | Diff: 0.769, p-value: 0.13 | Diff: -0.594, p-value: 0.19 |
More than one variable can be passed to the x_vars or y_vars parameters. In this way a correlation matrix can be created.
vars <- c('age', 'weight', 'height', 'bmi')
vlabels <- c('Age', 'Weight', 'Height', 'BMI')
tab <- tabular.ade(x_vars=vars, xname=vlabels, y_vars=vars, yname=vlabels, data=d, FUN=corr_p_cell, digits=2)
knitr::kable(tab, caption='Pearson correlation')
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| | Age | Weight | Height | BMI |
| Age | 1.00 | 0.01 | 0.00 | 0.01 |
| Weight | 0.01 | 1.00 | 0.01 | 0.69 |
| Height | 0.00 | 0.01 | 1.00 | -0.71 |
| BMI | 0.01 | 0.69 | -0.71 | 1.00 |
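For comparison, a plain Pearson correlation matrix of this kind can be obtained in base R with cor(). The sketch uses a small made-up data frame (toy) rather than the data d from the examples, and it does not reproduce any additional output that corr_p_cell may offer.

```r
# Made-up data; bmi is derived from weight and height as weight / height^2
toy <- data.frame(
  weight = c(60, 72, 85, 90, 100),
  height = c(1.60, 1.75, 1.70, 1.80, 1.65)
)
toy$bmi <- toy$weight / toy$height^2

# Pairwise Pearson correlations, rounded to two digits
r <- round(cor(toy, method = "pearson"), 2)
r
```

The result is symmetric with ones on the diagonal, the same layout as the table above.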
If there are multiple x variables, they are listed row by row.
vars <- c('age', 'weight', 'height', 'bmi')
vlabels <- c('Age', 'Weight', 'Height', 'BMI')
tab <- tabular.ade(x_vars=vars, xname=vlabels, cols=c('sex','stage'), cnames=c('Sex','Stage'), data=d, FUN=quantile_cell)
knitr::kable(tab, caption='Medians')
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| | Sex | Men | | | Women | | |
| | Stage | 1 | 2 | 3 | 1 | 2 | 3 |
| Age | | 49.0 | 49.0 | 50.0 | 50.0 | 51.0 | 50.0 |
| Weight | | 80.1 | 79.7 | 80.4 | 80.2 | 80.5 | 80.5 |
| Height | | 1.60 | 1.60 | 1.59 | 1.60 | 1.60 | 1.60 |
| BMI | | 31.2 | 31.3 | 31.4 | 31.3 | 31.4 | 31.5 |
ALL keyword
The keyword ALL, placed after a factor in the rows or cols statement, adds a row or column for the overall sample.
tab <- tabular.ade(x_vars='sex', rows=c('treat', 'ALL'), rnames=c('Treatment'), cols=c('disease', 'ALL'), cnames=c('Disease'), data=d, FUN=n_cell, alllabel='both')
knitr::kable(tab, caption='Contingency table')
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| Treatment | | | | |
| | Disease | no | yes | both |
| no | | 7159 | 833 | 7992 |
| yes | | 1798 | 210 | 2008 |
| both | | 8957 | 1043 | 10000 |
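The overall row and column produced by ALL correspond to the margins of a contingency table. A comparable base R sketch, on made-up vectors rather than the data d, uses table() and addmargins(); the vectors treat_toy and disease_toy are assumptions for illustration only.

```r
# Made-up treatment and disease indicators for six subjects
treat_toy   <- factor(c("no", "no", "yes", "no", "yes", "no"))
disease_toy <- factor(c("no", "yes", "no", "no", "no", "no"))

# Cross-tabulate and append row and column totals (labelled "Sum" by default)
ct <- addmargins(table(Treatment = treat_toy, Disease = disease_toy))
ct
```

Here the "Sum" margins play the role that the alllabel='both' row and column play in the table above.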
Most of the predefined cell functions support weighting through the w=weights parameter. This way, weighted statistics can be calculated.
tab <- tabular.ade(x_vars='sex', rows=c('sex', 'ALL', 'ethnic', 'stage'), rnames=c('Sex','Ethnicity', 'Stage'), w='ws', data=d, FUN=n_cell, digits=1)
knitr::kable(tab, caption='weighted N')
| 1 | 2 | 3 | 4 |
|---|---|---|---|
| Sex | Ethnicity | Stage | |
| Men | Other | 1 | 486.1 |
| | | 2 | 408.0 |
| | | 3 | 95.8 |
| | Caucasian | 1 | 1459.6 |
| | | 2 | 1224.7 |
| | | 3 | 283.6 |
| Women | Other | 1 | 504.5 |
| | | 2 | 413.3 |
| | | 3 | 107.0 |
| | Caucasian | 1 | 1504.3 |
| | | 2 | 1212.1 |
| | | 3 | 295.2 |
| Total | Other | 1 | 990.5 |
| | | 2 | 821.2 |
| | | 3 | 202.7 |
| | Caucasian | 1 | 2963.9 |
| | | 2 | 2436.7 |
| | | 3 | 578.8 |
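One common definition of a weighted count is the sum of the weights within each group. Whether n_cell computes exactly this is an assumption here. The base R sketch below illustrates the idea with tapply() on made-up vectors (sex_toy, ws_toy) that are not part of the example data.

```r
# Made-up grouping factor and sampling weights
sex_toy <- factor(c("Men", "Women", "Men", "Women", "Men"))
ws_toy  <- c(0.5, 1.2, 0.8, 1.0, 1.5)

# Weighted N per group: sum of the weights of the observations in that group
wn <- tapply(ws_toy, sex_toy, sum)
round(wn, 1)
```

Because the weights need not be integers, weighted counts are usually reported with decimals, as in the table above.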
The predefined cell function stat_cell can calculate several statistics at once. The statistics are selected using keywords in the x_vars or y_vars parameters.
vars <- c('age', 'weight', 'height', 'bmi')
vlabels <- c('Age', 'Weight', 'Height', 'BMI')
keywords <- c('MIN', 'MAX', 'MEAN', 'SD', 'CV', 'SKEW', 'KURT')
keylabels <- c('Min', 'Max', 'Mean', 'SD', 'CV', 'Skewness', 'Kurtosis')
tab <- tabular.ade(x_vars=vars, xname=vlabels, y_vars=keywords, yname=keylabels, data=d, FUN=stat_cell)
knitr::kable(tab, caption='Various statistics')
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| | Min | Max | Mean | SD | CV | Skewness | Kurtosis |
| Age | 20.0 | 80.0 | 49.9 | 17.4 | 0.348 | 0.0150 | -1.20 |
| Weight | 35.2 | 115 | 80.0 | 10.0 | 0.125 | -0.0475 | 0.0314 |
| Height | 1.19 | 2.02 | 1.60 | 0.101 | 0.0630 | 0.0107 | 0.0500 |
| BMI | 14.8 | 65.2 | 31.7 | 5.69 | 0.180 | 0.484 | 0.671 |
rows parameter
keywords <- c('N', 'MIN', 'MAX', 'MEAN', 'SD')
keylabels <- c('N', 'Min', 'Max', 'Mean', 'SD')
tab <- tabular.ade(x_vars=vars, xname=vlabels, y_vars=keywords, yname=keylabels, rows=c('sex','ALL','ethnic'), rnames=c('Sex','Ethnicity'), data=d, FUN=stat_cell)
knitr::kable(tab, caption='Various statistics')
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| | Sex | Ethnicity | | | | | |
| | | | N | Min | Max | Mean | SD |
| Age | Men | Other | 1268 | 20.0 | 80.0 | 49.7 | 17.2 |
| | | Caucasian | 3713 | 20.0 | 80.0 | 49.8 | 17.6 |
| | Women | Other | 1265 | 20.0 | 80.0 | 50.1 | 17.3 |
| | | Caucasian | 3754 | 20.0 | 80.0 | 50.1 | 17.2 |
| | Total | Other | 2533 | 20.0 | 80.0 | 49.9 | 17.3 |
| | | Caucasian | 7467 | 20.0 | 80.0 | 50.0 | 17.4 |
| Weight | Men | Other | 1268 | 44.2 | 110 | 80.0 | 9.98 |
| | | Caucasian | 3713 | 41.1 | 114 | 79.8 | 10.1 |
| | Women | Other | 1265 | 43.1 | 111 | 80.5 | 10.0 |
| | | Caucasian | 3754 | 35.2 | 115 | 80.0 | 9.97 |
| | Total | Other | 2533 | 43.1 | 111 | 80.2 | 10.0 |
| | | Caucasian | 7467 | 35.2 | 115 | 79.9 | 10.0 |
| Height | Men | Other | 1268 | 1.19 | 1.98 | 1.60 | 0.103 |
| | | Caucasian | 3713 | 1.23 | 2.02 | 1.60 | 0.101 |
| | Women | Other | 1265 | 1.24 | 1.98 | 1.60 | 0.0989 |
| | | Caucasian | 3754 | 1.21 | 1.95 | 1.60 | 0.101 |
| | Total | Other | 2533 | 1.19 | 1.98 | 1.60 | 0.101 |
| | | Caucasian | 7467 | 1.21 | 2.02 | 1.60 | 0.101 |
| BMI | Men | Other | 1268 | 15.0 | 58.7 | 31.7 | 5.88 |
| | | Caucasian | 3713 | 15.8 | 65.2 | 31.6 | 5.68 |
| | Women | Other | 1265 | 15.5 | 64.9 | 31.8 | 5.64 |
| | | Caucasian | 3754 | 14.8 | 58.6 | 31.7 | 5.66 |
| | Total | Other | 2533 | 15.0 | 64.9 | 31.7 | 5.76 |
| | | Caucasian | 7467 | 14.8 | 65.2 | 31.7 | 5.67 |
x_vars parameter
keywords <- c('N', 'MIN', 'MAX', 'MEAN', 'SD')
keylabels <- c('N', 'Min', 'Max', 'Mean', 'SD')
tab <- tabular.ade(x_vars=keywords, xname=keylabels, y_vars=vars, yname=vlabels, rows=c('sex', 'ALL'), rnames=c('Sex'), data=d, FUN=stat_cell)
knitr::kable(tab, caption='Various statistics')
| 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| | Sex | | | | |
| | | Age | Weight | Height | BMI |
| N | Men | 4981 | 4981 | 4981 | 4981 |
| | Women | 5019 | 5019 | 5019 | 5019 |
| | Total | 10000 | 10000 | 10000 | 10000 |
| Min | Men | 20.0 | 41.1 | 1.19 | 15.0 |
| | Women | 20.0 | 35.2 | 1.21 | 14.8 |
| | Total | 20.0 | 35.2 | 1.19 | 14.8 |
| Max | Men | 80.0 | 114 | 2.02 | 65.2 |
| | Women | 80.0 | 115 | 1.98 | 64.9 |
| | Total | 80.0 | 115 | 2.02 | 65.2 |
| Mean | Men | 49.8 | 79.9 | 1.60 | 31.6 |
| | Women | 50.1 | 80.1 | 1.60 | 31.7 |
| | Total | 49.9 | 80.0 | 1.60 | 31.7 |
| SD | Men | 17.5 | 10.0 | 0.101 | 5.73 |
| | Women | 17.3 | 9.98 | 0.100 | 5.66 |
| | Total | 17.4 | 10.0 | 0.101 | 5.69 |
vars <- c('age', 'weight', 'height', 'bmi')
vlabels <- c('Age', 'Weight', 'Height', 'BMI')
keywords <- c('N', 'MEDIAN', 'IQR')
keylabels <- c('N', 'Median', 'IQR')
tab <- tabular.ade(x_vars=vars, xname=vlabels, y_vars=keywords, yname=keylabels, rows=c('sex', 'ALL'), rnames=c('Sex'), cols=c('ethnic'), cnames=c('Ethnicity'), w='ws', data=d, FUN=stat_cell)
knitr::kable(tab, caption='Various statistics')
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|
| | Sex | | | | | | | |
| | | | N | | Median | | IQR | |
| | | Ethnicity | Other | Caucasian | Other | Caucasian | Other | Caucasian |
| Age | Men | | 990 | 2968 | 49.0 | 49.0 | 29.0 | 31.0 |
| | Women | | 1025 | 3012 | 50.0 | 50.0 | 30.0 | 29.0 |
| | Total | | 2015 | 5979 | 50.0 | 49.0 | 30.0 | 30.0 |
| Weight | Men | | 990 | 2968 | 80.3 | 80.1 | 13.2 | 13.3 |
| | Women | | 1025 | 3012 | 80.6 | 80.2 | 12.9 | 13.3 |
| | Total | | 2015 | 5979 | 80.5 | 80.2 | 13.2 | 13.4 |
| Height | Men | | 990 | 2968 | 1.60 | 1.60 | 0.140 | 0.137 |
| | Women | | 1025 | 3012 | 1.60 | 1.60 | 0.122 | 0.139 |
| | Total | | 2015 | 5979 | 1.60 | 1.60 | 0.132 | 0.138 |
| BMI | Men | | 990 | 2968 | 31.2 | 31.3 | 7.99 | 7.50 |
| | Women | | 1025 | 3012 | 31.4 | 31.3 | 7.53 | 7.18 |
| | Total | | 2015 | 5979 | 31.3 | 31.3 | 7.77 | 7.35 |