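If you need personalized or company-specific checks, the `sdtmchecks` R package offers some very useful support. As the package introduction says, it does not aim to cover every SDTM check rule, nor is it a replacement for P21 data validation; its main goal is to provide data checks that are generalizable, actionable, and meaningful.

The `sdtmchecks` package contains only a small number of methods and functions. It is very lean, but everything in it is practical. This is an approach I like a lot: a clear goal and a modest set of functions to call, with a gentle learning curve.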

First, install the version on the main branch from GitHub:

```r
# install.packages("devtools")
devtools::install_github("pharmaverse/sdtmchecks", ref = "main")
```

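After loading the `sdtmchecks` package, I can first browse how many built-in data check types it provides. They can also be looked up on the web, for example under Search Data Check Functions, which gives very detailed check descriptions that help us understand what is shown in the final report.

```r
library(sdtmchecks)

# Just type this in
sdtmchecksmeta
```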

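Next, I import all SDTM datasets under the current directory:

```r
fn <- list.files(path = "./SDTM", pattern = "sas7bdat", full.names = TRUE)

for (file in fn) {
  # Dataset name without the file extension, e.g. "ae" for "ae.sas7bdat"
  f <- stringr::str_remove(basename(file), "\\.sas7bdat$")
  assign(f, value = haven::read_sas(file))
}
```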

Then you can run a single data check function, whose name can be taken from the `sdtmchecksmeta` object, for example `check_ae_aedecod(ae)`, or run all the checks as shown below:

```r
myreport <- run_all_checks(
  metads = sdtmchecksmeta,
  priority = c("High", "Medium", "Low"), # subset checks based on priority
  type = c("ALL", "ONC", "PRO", "OPHTH"), # subset checks based on category
  verbose = TRUE
)
```

Finally, generate a report to review all the data issues:

`report_to_xlsx(res = myreport, outfile = "sdtm_check_report.xlsx")`

That said, the general checks built into the package may not suit some of our projects. In that case you need to assess more carefully which check modules the SDTM data of the current project actually need, for example:

```r
library(dplyr)

# Subset to checks that should work OK for most datasets
metads <- sdtmchecksmeta %>%
  filter(check %in% c(
    "check_ae_aedecod",
    "check_ae_aetoxgr",
    "check_ae_dup",
    "check_cm_cmdecod",
    "check_cm_missing_month",
    "check_dm_age_missing",
    "check_dm_usubjid_dup",
    "check_dm_armcd"
  ))

myreport <- run_all_checks(metads = metads, verbose = TRUE)
```

When necessary, you can even write your own checks; see Writing a New Check.
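As a rough illustration of what a project-specific check can look like, here is a minimal sketch in plain R. It deliberately does not use the `sdtmchecks` helper functions: the function name `check_dm_usubjid_dup_sketch()`, the structure of the returned list, and the toy `dm` data are all assumptions made for this example, so follow the package's Writing a New Check guide for the real conventions.

```r
# Hypothetical project-specific check written in base R (not the sdtmchecks API).
# It flags USUBJID values that appear more than once in DM.
check_dm_usubjid_dup_sketch <- function(dm) {
  if (!"USUBJID" %in% names(dm)) {
    return(list(pass = FALSE, msg = "DM is missing the variable USUBJID", data = NULL))
  }
  dup_ids <- unique(dm$USUBJID[duplicated(dm$USUBJID)])
  if (length(dup_ids) == 0) {
    return(list(pass = TRUE, msg = NULL, data = NULL))
  }
  list(
    pass = FALSE,
    msg = paste(length(dup_ids), "duplicated USUBJID value(s) in DM"),
    data = dm[dm$USUBJID %in% dup_ids, , drop = FALSE]
  )
}

# Toy data for illustration only
dm <- data.frame(
  USUBJID = c("S-001", "S-002", "S-002", "S-003"),
  AGE = c(54, 61, 61, 47)
)
res <- check_dm_usubjid_dup_sketch(dm)
res$pass
res$msg
res$data
```

Returning the offending rows together with a short message keeps the result easy to review, which is the same spirit as the report produced by the package.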

All of the above comes from the `sdtmchecks` package documentation; more information can be found at https://github.com/pharmaverse/sdtmchecks

`mice` package but left out the LS-means and hypothesis test. Luckily, I found that the `emmeans` package wraps this process, so we can use it to obtain the pooled LS-means estimates and p-values directly, without the `pool()` function of `mice`.

Suppose that we have fitted the ANCOVA on the imputed datasets and have the fitted models `mods` for each imputation. I will then use the `emmeans::emmeans()` function to estimate the LS-means; the result is not the individual estimate for each imputation but the pooled one. The pooling still uses Rubin's Rules.

```r
ems <- emmeans::emmeans(mods, ~trt)
data.frame(ems)
##   trt    emmean        SE       df  lower.CL   upper.CL
## 1   1 -10.55330 0.4696385 194.9729 -11.47953  -9.627078
## 2   2 -12.39073 0.4698075 194.8239 -13.31729 -11.464169
```

Afterwards, use the `emmeans::contrast()` function to do the contrast analysis and get the CI and p-value for the difference (`trt2 - trt1`).

```r
conr <- emmeans::contrast(ems, method = list(c(-1, 1)), adjust = "none")
conr_test <- emmeans::test(conr)
data.frame(conr_test)
##   contrast  estimate        SE       df   t.ratio     p.value
## 1 c(-1, 1) -1.837429 0.6657861 194.8238 -2.759788 0.006335778
```

If we want to validate the above result with SAS procedures, we must first export a CSV file from the `low_imp_res` data frame that we created in the last article.

`write.csv(low_imp_res, file = "./low_imputed.csv", row.names = FALSE, na = "")`

Then fit the ANCOVA model with `proc mixed` to obtain the LS-means estimates (`lsm` and `diff`) for each imputation. Finally, use `proc mianalyze` to pool the results of all imputations for the LS-means (`comb_lsm`) and for the difference between the two groups (`comb_diff`).

```sas
proc import datafile="&_projpth.\02_Raw Data\low_imputed.csv"
    out=low_imp(where=(imp ne 0))
    dbms=csv replace;
  getnames = yes;
run;

ods output lsmeans=lsm diffs=diff;
proc mixed data=low_imp;
  by imp;
  class trt / ref=first;
  model week8=basval trt / ddfm=kr;
  lsmeans trt / cl pdiff diff;
run;

proc sort data=lsm;
  by trt;
run;

ods output ParameterEstimates=comb_lsm;
proc mianalyze data=lsm;
  by trt;
  modeleffects estimate;
  stderr stderr;
run;

ods output ParameterEstimates=comb_diff;
proc mianalyze data=diff;
  by trt _trt;
  modeleffects estimate;
  stderr stderr;
run;
```

The SAS output can be seen below.

We can observe that the estimate and SE from SAS are consistent with R, but there is a large discrepancy in the DF (degrees of freedom). In R the df is about `194.82`, while in SAS it is `3.94E6`. That's because there are different methods to calculate the df: an older one is used in SAS and an adjusted version is used in the `mice` package. This also leads to different p-values. More details can be found in Degrees of Freedom and P-values of Rubins-Rules.

Some readers may be curious how the df is calculated in `mice` and in SAS. Let's repeat the calculation using the formulas in the link above. The formulas are easy to follow, so I simply translate them into R code.

The older method used in SAS to calculate the df for the t-distribution is defined in Rubin (1987) and Van Buuren (2018). In the code below, `lambda` is derived from the between-imputation variance (`Vb`) and the total variance (`Vt`), and `m` is the number of imputed datasets. The rounded value is `3.94E6`, which equals the result in SAS.

```r
m <- 5
Vb <- 0.000372
Vt <- 0.443271
lambda <- (Vb + Vb / m) / Vt
df_old <- (m - 1) / lambda^2
df_old
## [1] 3944121
```

The `df` above is far too large for the pooled result compared with the dfs in each imputed dataset, which is inappropriate. Barnard and Rubin (1999) adjusted this df with a new formula (see formula 9.9 in that article). We first compute the observed df (`df_obs`) and then the adjusted df (`df_adj`), where `n` is the sample size of the imputed datasets and `k` is the number of parameters in the fitted model (in my case, 3).

```r
n <- 200
k <- 3
m <- 5
df_obs <- (((n - k) + 1) / ((n - k) + 3)) * (n - k) * (1 - lambda)
df_adj <- (df_old * df_obs) / (df_old + df_obs)
df_adj
## [1] 194.824
```
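For reference, the two R snippets above correspond to the following formulas, written in the notation of the code ($V_B$ for `Vb`, $V_T$ for `Vt`). This is a transcription of the code, not a quotation from the cited references.

```latex
\begin{aligned}
\lambda &= \frac{V_B + V_B / m}{V_T}, \\
df_{\text{old}} &= \frac{m - 1}{\lambda^{2}}, \\
df_{\text{obs}} &= \frac{(n - k) + 1}{(n - k) + 3}\,(n - k)\,(1 - \lambda), \\
df_{\text{adj}} &= \frac{df_{\text{old}}\; df_{\text{obs}}}{df_{\text{old}} + df_{\text{obs}}}.
\end{aligned}
```

Because $df_{\text{adj}}$ has the form of a harmonic-type combination, it can never exceed the smaller of $df_{\text{old}}$ and $df_{\text{obs}}$, which is why the adjusted value stays close to the complete-data df of $n - k = 197$.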

Finally, we can look at the df of the pooled estimates in `pool_res`, obtained with the `pool()` function. The `df` for the `trt2` term is about equal to our computation, as seen below.

```r
library(mice)

pool_res <- pool(mods)
pool_res
## Class: mipo    m = 5 
##          term m   estimate        ubar            b           t dfcom       df         riv
## 1 (Intercept) 5  0.6767737 2.982354968 6.679090e-03 2.990369876   197 194.4394 0.002687443
## 2      basval 5 -0.5354029 0.006510448 1.426767e-05 0.006527569   197 194.4534 0.002629804
## 3        trt2 5 -1.8374286 0.442824415 3.722857e-04 0.443271157   197 194.8238 0.001008849
##        lambda        fmi
## 1 0.002680240 0.01278278
## 2 0.002622906 0.01272531
## 3 0.001007832 0.01110765
```

The `df` calculation for the pooling process in the `emmeans` package for the `mira` class has been kept consistent with the `mice` package, using the Barnard-Rubin adjustment for small samples (Barnard and Rubin, 1999) that is mentioned in the `pool()` documentation; see https://github.com/rvlenth/emmeans/issues/494. Thus we get the same `df` from either the `mice` or the `emmeans` package. All results above have been updated accordingly.

In R, we can use the `mice` package (Multivariate Imputation by Chained Equations) to perform multiple imputation, and it offers linear regression as one of the imputation methods.

Chained equations is a variation of a Gibbs sampler (an MCMC approach) that iterates between drawing estimates of the missing values and estimates of the parameters of each variable's distribution (both conditional on the other variables).

So what is linear regression imputation, and what are its advantages and disadvantages? Please see the explanation below from this article (Multiple Imputation).

In regression imputation, the existing variables are used to predict, and then the predicted value is substituted as if an actually obtained value. This approach has several advantages because the imputation retains a great deal of data over the listwise or pairwise deletion and avoids significantly altering the standard deviation or the shape of the distribution. However, as in a mean substitution, while a regression imputation substitutes a value predicted from other variables, no novel information is added, while the sample size has been increased and the standard error is reduced.
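To make the quoted description concrete, here is a minimal base R sketch of single regression imputation on made-up data. The variable names and the simulated values are assumptions for illustration only; the example is not the `mice` implementation, it just shows the idea of replacing a missing value with its prediction.

```r
# Toy data (made up for illustration): y depends on x, some y values are missing
set.seed(123)
toy <- data.frame(x = rnorm(50, mean = 20, sd = 4))
toy$y <- -0.5 * toy$x + rnorm(50, sd = 2)
toy$y[c(3, 17, 29, 42)] <- NA

# Fit the regression on the observed rows only (lm drops rows with NA by default)
fit <- lm(y ~ x, data = toy)

# Replace each missing y with its predicted value
miss <- is.na(toy$y)
toy_imp <- toy
toy_imp$y[miss] <- predict(fit, newdata = toy[miss, , drop = FALSE])

# Compare the spread of y before and after imputation
sd(toy$y, na.rm = TRUE)
sd(toy_imp$y)
```

Every imputed value lies exactly on the fitted line, so the imputed rows carry no residual noise. That is the mechanism behind the caveat in the quote: the sample size grows, but no new information is added.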

Then let's jump into the implementation with the `mice` package. Here I will use the example data set (`low1.sas7bdat`) from Mallinckrodt et al. (https://journals.sagepub.com/doi/pdf/10.1177/2168479013501310), which is available via https://www.lshtm.ac.uk/research/centres-projects-groups/missing-data#dia-missing-data.

```r
low1 <- haven::read_sas("./low1.sas7bdat")
head(low1)
## # A tibble: 6 × 6
##   PATIENT POOLINV trt   basval  week change
##     <dbl> <chr>   <chr>  <dbl> <dbl>  <dbl>
## 1    1005 101     2         16     1     -3
## 2    1005 101     2         16     2     -5
## 3    1005 101     2         16     4    -10
## 4    1005 101     2         16     6    -11
## 5    1005 101     2         16     8    -13
## 6    1006 101     2         17     1     -1
```

As shown above, this is a long-format data set with only a few variables, whose meaning is easy to guess from their names. To meet the requirements of the `mice` functions, I will first convert it to wide format, with a separate variable for each time point (week1 to week8).

```r
library(dplyr)
library(tidyr)

low_wide <- low1 %>%
  pivot_wider(names_from = week, names_prefix = "week", values_from = change) %>%
  select(-POOLINV)
head(low_wide)
## # A tibble: 6 × 8
##   PATIENT trt   basval week1 week2 week4 week6 week8
##     <dbl> <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1    1005 2         16    -3    -5   -10   -11   -13
## 2    1006 2         17    -1    -2    -6   -10   -12
## 3    1008 1         32    -6   -12   -17   -20   -22
## 4    1011 1         18    -1    -5    -8    NA    NA
## 5    1012 1         22    -6    -9   -13   -16   -17
## 6    1015 2         29    -6   -14   -14   -20   -25
```

Then let's have a look at the missing pattern of this example.

```r
low_wide %>%
  select(basval, week1, week2, week4, week8) %>%
  mutate(across(1:5, function(x) {
    if_else(is.na(x), ".", "X")
  })) %>%
  group_by_all() %>%
  count(name = "Freq")
## # A tibble: 4 × 6
## # Groups:   basval, week1, week2, week4, week8 [4]
##   basval week1 week2 week4 week8  Freq
##   <chr>  <chr> <chr> <chr> <chr> <int>
## 1 X      X     .     .     .         4
## 2 X      X     X     .     .         3
## 3 X      X     X     X     .         9
## 4 X      X     X     X     X       184
```

Besides, you can also use `mice::md.pattern(low_wide)` to display the missing patterns, which gives output similar to the above.

As shown above, an 'X' indicates that the value at a given visit is observed, and a '.' indicates that it is missing. Once a subject has a missing value, all later visits are missing too, so we can treat the missingness pattern as monotone and assume the data are MAR for the later analysis.
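The monotone property can also be checked programmatically. The sketch below is a small base R helper on made-up data; the function name `is_monotone()` and the toy values are hypothetical and not part of `mice`.

```r
# Returns TRUE if, within every row, no observed value follows a missing one
# when the columns are read left to right (i.e. in visit order).
is_monotone <- function(data, visit_cols) {
  miss <- is.na(as.matrix(data[, visit_cols, drop = FALSE]))
  all(apply(miss, 1, function(r) all(diff(as.integer(r)) >= 0)))
}

# Toy data for illustration only: the fourth subject has an intermittent gap
toy <- data.frame(
  week1 = c(-3, -1, -6, -2),
  week2 = c(-5, -2, NA, -4),
  week4 = c(-10, NA, NA, NA),
  week8 = c(-13, NA, NA, -9)
)
visits <- c("week1", "week2", "week4", "week8")

is_monotone(toy, visits)
is_monotone(toy[1:3, ], visits)
```

The helper relies on the columns being passed in chronological order, so the caller is responsible for listing the visits correctly.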

Now we will generate 5 imputed datasets with the `mice()` function by setting `m = 5` and `method = "norm.predict"` (linear regression through prediction).

```r
library(mice)

low_imp <- mice(low_wide, m = 5, method = "norm.predict")
summary(low_imp)
```

If you would like to check whether all missing values have been imputed, inspect `low_imp$imp`. You can also choose which imputed dataset to extract by setting the `action` argument of `complete()`. `action = 0` returns the original dataset with missing values, and `action = 1` returns the first imputed dataset.

`low_imp_1 <- complete(low_imp, action = 1)`

And maybe we'd like to have all imputed datasets in a long format that will be easy to handle and analyze at some point.

`low_imp_res <- complete(low_imp, action = "long", include = TRUE)`

If you already have an imputed dataset in long format from another imputation method, `mice::as.mids()` is a helpful function that converts it to an object of class `mids` for further analysis in `mice`. The long data should contain the original data as well as the imputed datasets, identified by the `.imp` and `.id` columns.

`mids <- as.mids(low_imp_res)`

The second step is to fit the ANCOVA model for the `week8` time point, with treatment (`trt`) as the independent variable, change from baseline at week 8 (`week8`) as the response variable, and baseline (`basval`) as a covariate.

`mods <- with(low_imp, lm(week8 ~ basval + trt))`

`mods` is an object of class `mira` that contains the call and the fitted model object for each of the imputations.

The final step is to combine the results from the ANCOVA models using Rubin's Rules (https://bookdown.org/mwheymans/bookmi/rubins-rules.html), which is also the method used by `proc mianalyze` in SAS.

```r
pool_res <- pool(mods)
summary(pool_res)
##          term   estimate  std.error  statistic       df      p.value
## 1 (Intercept)  0.5759012 1.73148889  0.3326046 194.3680 7.397912e-01
## 2      basval -0.5329080 0.08089651 -6.5875270 194.3860 4.086519e-10
## 3        trt2 -1.7308118 0.66660413 -2.5964612 194.7855 1.013719e-02
```

The output above shows the pooled estimates, such as the coefficients and standard errors. They are derived from the `coef()` and `vcov()` results of the fitted models (here `lm`) in each imputation. The estimated treatment coefficient is around `-1.73` with a standard error of around `0.67`.
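To show what the pooling step does for a single coefficient, here is a minimal base R sketch of Rubin's Rules. The five estimates and standard errors are hypothetical numbers chosen for illustration, and `rubin_pool()` is a made-up helper, not a `mice` function; the element names simply mirror the `ubar`, `b` and `t` columns printed by `pool()`.

```r
# Hypothetical per-imputation results for one coefficient (m = 5)
est <- c(-1.70, -1.76, -1.72, -1.75, -1.73) # point estimates
se <- c(0.665, 0.667, 0.666, 0.664, 0.668)  # standard errors

rubin_pool <- function(est, se) {
  m <- length(est)
  qbar <- mean(est)           # pooled estimate
  ubar <- mean(se^2)          # within-imputation variance
  b <- var(est)               # between-imputation variance
  t <- ubar + (1 + 1 / m) * b # total variance
  c(estimate = qbar, ubar = ubar, b = b, t = t, se = sqrt(t))
}

pooled <- rubin_pool(est, se)
pooled
```

The pooled standard error is always at least as large as the square root of the average within-imputation variance, because the between-imputation term adds the uncertainty caused by the missing data.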

However, the coefficient and standard error are not the final results we would like to display in a clinical report. We should get the LS-means estimate for each group, then the contrast between the two groups together with its hypothesis test. That will be discussed in the next topic.

Multiple Imputation: Linear Regression

Flexible Imputation of Missing Data

Missing Data (Rough) Notes

Multiple Imputation with the mice Package

Multiple imputation without a specialist R package

Multiple Imputation

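In the past, R or Python was probably used only for some internal needs within pharma companies. But now that large pharma companies have gradually started to submit R-generated results to regulators, and with submission data changing as well (for example the move to JSON format), R or Python should eventually have a complete analyze-present-submit workflow in clinical statistical analysis.

Below is my first attempt at the R package `rtables`. For example, suppose we want to produce a survival analysis table of the kind commonly seen in oncology trials, as shown below (not for submission purposes).

That said, if you really want to use `rtables` to produce clinical analysis tables, it is best paired with the `tern` package: the former mainly builds the table, the latter analyzes the data, and the two connect seamlessly.

If you don't want to use `tern`, you have to take another route: write custom functions for the analysis and call them inside the `rtables` functions. That is less convenient than the `tern` + `rtables` combination.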

Getting to the point: first I load the `whas500` dataset for the subsequent survival analysis (Kaplan-Meier), mainly using three of its variables:

- AFB: Atrial Fibrillation (0 = No, 1 = Yes)
- LENFOL: Total Length of Follow-up (Days between Date of Last Follow-up and Hospital Admission Date)
- FSTAT: Vital Status at Last Follow-up (0 = Alive 1 = Dead)

Then I do some processing on the grouping variable `AFB` and fit the KM model, with the confidence interval computed using the log-log transformation:

```r
library(survival)
library(tidyverse)
library(rtables)

data("whas500", package = "stabiot")

dat <- whas500 %>%
  dplyr::mutate(
    AFB = case_when(
      AFB == 1 ~ "Yes",
      TRUE ~ "No"
    ),
    AFB = factor(AFB, levels = c("Yes", "No"))
  )

km_fit <- survfit(
  data = dat,
  formula = Surv(LENFOL, FSTAT) ~ AFB,
  conf.int = 0.95,
  conf.type = "log-log"
)
```

Next, I compute the median, 25th and 75th percentile survival times and the log-rank test, and reshape the results so that they fit more easily into the `rtables` syntax (that is, to avoid reshaping data inside the custom `rtables` functions):

```r
# median survival time
surv_med <- summary(km_fit)$table

# quantile survival time
surv_quant <- quantile(km_fit, probs = c(0.25, 0.75)) %>%
  purrr::map(\(df) as.data.frame(df) %>% rownames_to_column(var = "group")) %>%
  purrr::list_rbind() %>%
  mutate(
    stat = rep(c("est", "lower", "upper"), each = 2)
  ) %>%
  pivot_longer(cols = c("25", "75"), names_to = "quantile") %>%
  pivot_wider(
    names_from = c("stat", "quantile"),
    names_glue = "Q{quantile}_{stat}",
    values_from = "value"
  ) %>%
  column_to_rownames(var = "group")

# test survival curves
surv_pval <- survminer::surv_pvalue(km_fit, method = "log-rank")
```

Then I compute the survival rates at 12, 36 and 60 months and the between-group comparison of the rates; the latter uses a Z-test to compute the CI and p-value.

```r
# survival rate at 12/36/60 months
tp_cols <- c("time", "n.risk", "n.event", "n.censor", "surv", "std.err", "lower", "upper")
surv_rate <- summary(km_fit, times = c(12, 36, 60), extend = TRUE)[tp_cols] %>%
  as.data.frame() %>%
  split(~time) %>%
  purrr::map(\(df) magrittr::set_rownames(df, c("AFB=Yes", "AFB=No")))

# difference of survival rate by time-point
surv_rate_diff <- surv_rate %>%
  purrr::map(function(x) {
    tibble::tibble(
      time = unique(x$time),
      surv.diff = diff(x$surv),
      std.err = sqrt(sum(x$std.err^2)),
      lower = surv.diff - stats::qnorm(1 - 0.05 / 2) * std.err,
      upper = surv.diff + stats::qnorm(1 - 0.05 / 2) * std.err,
      pval = if (is.na(std.err)) {
        NA
      } else {
        2 * (1 - stats::pnorm(abs(surv.diff) / std.err))
      }
    )
  })
```

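That completes the analysis steps. In fact, these routine analysis steps are all already included in the `tern` package.

Next comes writing the custom functions to be used in the `analyze()` function of `rtables`.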

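First, to summarize the numbers of events and censored subjects in the two groups, we count the 0/1 categories of the `FSTAT` variable in the input dataset and divide by the number of subjects in each group, `.N_col`, to get the percentages. `in_rows()` represents a multi-row analysis, where each input corresponds to one row of results, and the `format` argument defines the output format:

```r
a_count_subjd <- function(df, .N_col) {
  in_rows(
    "Number of events" = rcell(
      sum(df$FSTAT == 1) * c(1, 1 / .N_col),
      format = "xx (xx.xx%)"
    ),
    "Number of censored" = rcell(
      sum(df$FSTAT == 0) * c(1, 1 / .N_col),
      format = "xx (xx.xx%)"
    )
  )
}
```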

Next, summarize the survival time estimates and their CIs at each quantile, plus the minimum and maximum survival times. In this custom function I use extra arguments (`med_tb` and `quant_tb`), which correspond to the datasets for the median and for the 25th/75th percentiles; the `ind` variable is used to find the row for the corresponding group in those datasets:

```r
a_surv_time_func <- function(df, .var, med_tb, quant_tb) {
  ind <- grep(df[[.var]][1], row.names(med_tb), fixed = TRUE)
  med_time <- list(med_tb[ind, c("median", "0.95LCL", "0.95UCL")])
  quantile_time <- lapply(c("25", "75"), function(x) {
    unlist(c(quant_tb[ind, grep(paste0("Q", x), names(quant_tb))]))
  })
  range_time <- list(range(df[["LENFOL"]]))
  in_rows(
    .list = c(med_time, quantile_time, range_time),
    .names = c(
      "Median (95% CI)",
      "25th percentile (95% CI)",
      "75th percentile (95% CI)",
      "Min, Max"
    ),
    .formats = c(
      "xx.xx (xx.xx - xx.xx)",
      "xx.xx (xx.xx - xx.xx)",
      "xx.xx (xx.xx - xx.xx)",
      "(xx.xx, xx.xx)"
    )
  )
}
```

Then summarize the log-rank p-value for the two groups; `pval_tb` is the corresponding dataset:

```r
a_surv_pval_func <- function(df, .var, .in_ref_col, pval_tb) {
  in_rows(
    "P-value" = non_ref_rcell(
      pval_tb[["pval"]],
      .in_ref_col,
      format = "x.xxxx | (<0.0001)"
    )
  )
}
```

Finally, summarize the survival rates at 12, 36 and 60 months and the between-group comparison of the rates; `rate_tb` and `rate_diff_tb` are the datasets of survival rates at each time point and of the between-group rate differences. `non_ref_rcell()` is for cases where the cell of the reference group should be empty, and the `indent_mod` argument adjusts the indentation (the default is 0, meaning no indentation):

```r
a_surv_rate_func <- function(df, .var, .in_ref_col, rate_tb, rate_diff_tb) {
  ind <- grep(df[[.var]][1], row.names(rate_tb), fixed = TRUE)
  in_rows(
    rcell(rate_tb[ind, "n.risk", drop = TRUE], format = "xx"),
    rcell(rate_tb[ind, "surv", drop = TRUE], format = "xx.xx"),
    rcell(unlist(rate_tb[ind, c("lower", "upper"), drop = TRUE]), format = "(xx.xx, xx.xx)"),
    non_ref_rcell(
      rate_diff_tb[, "surv.diff", drop = TRUE],
      .in_ref_col,
      format = "xx.xx"
    ),
    non_ref_rcell(
      unlist(rate_diff_tb[, c("lower", "upper"), drop = TRUE]),
      .in_ref_col,
      format = "(xx.xx, xx.xx)",
      indent_mod = 1L
    ),
    non_ref_rcell(
      rate_diff_tb[, "pval", drop = TRUE],
      .in_ref_col,
      format = "x.xxxx | (<0.0001)",
      indent_mod = 1L
    ),
    .names = c(
      "Number at risk",
      "Event-free rate (%)",
      "95% CI",
      "Difference in Event Free Rate (%)",
      "95% CI",
      "p-value (Z-test)"
    )
  )
}
```

With the custom functions done, we move on to the `rtables` layout. Since this table is fairly simple, the functions used are not complicated. `basic_table()` sets the basic table elements, such as the title and footnotes; `split_cols_by()` sets the column-splitting variable and defines the reference group; then come the analysis modules, where each `analyze()` call invokes one of the custom functions above. The 12/36/60-month block needs repeated calls, so a for loop is used. Finally, `build_table()` takes the finished layout and the dataset to produce the final table.

```r
result <- basic_table(
  show_colcounts = TRUE,
  title = "Table 14.2.1.1: Summary of Efficacy Evaluated"
) |>
  split_cols_by("AFB", ref_group = "AFB=Yes") |>
  analyze("AFB", a_count_subjd, show_labels = "hidden") |>
  analyze("AFB", a_surv_time_func,
    var_labels = "Time to event (months)", show_labels = "visible",
    extra_args = list(med_tb = surv_med, quant_tb = surv_quant),
    table_names = "kmtable"
  ) |>
  analyze("AFB", a_surv_pval_func,
    var_labels = "Unstratified log-rank test", show_labels = "visible",
    extra_args = list(pval_tb = surv_pval),
    table_names = "logrank"
  )

time_point <- c(12, 36, 60)
for (i in seq_along(time_point)) {
  result <- result |>
    analyze("AFB", a_surv_rate_func,
      var_labels = paste(time_point[i], "months"), show_labels = "visible",
      extra_args = list(rate_tb = surv_rate[[i]], rate_diff_tb = surv_rate_diff[[i]]),
      table_names = paste0("timepoint_", time_point[i])
    )
}

result |>
  build_table(dat %>% mutate(AFB = str_c("AFB=", AFB)))
```

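That is my rough understanding of the `rtables` package. For detailed tutorials, see the documentation at https://insightsengineering.github.io/rtables/latest-tag/ as well as some shared presentations (https://insightsengineering.github.io/rtables/latest-tag/#presentations).

Chinese-language tutorials on using R for clinical analysis and for producing tables and figures are still relatively scarce online. I hope this short post is helpful; if you spot any mistakes, please let me know.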

https://insightsengineering.github.io/rtables/latest-tag/

https://insightsengineering.github.io/tlg-catalog/stable/tables/efficacy/ttet01.html

https://pharmaverse.r-universe.dev/articles/rtables/introduction.html

https://www.pharmasug.org/proceedings/japan2023/PharmaSUG-Japan-2023-05.pdf

https://www.r-consortium.org/all-projects/tables-in-clinical-trials-with-r#rtables

Normally we will show the primary analyses, like below:

- Descriptive statistics of the number of events and censors.
- Median (and 25th, 75th percentile) survival time from Kaplan-Meier estimate, along with 95% CI that will be calculated via Brookmeyer and Crowley methodology using log-log transformation.
- Survival rate at each time-point of interest from Kaplan-Meier estimate, along with 95% CI that will be calculated via Greenwood formula using log-log transformation.
- Hazard Ratio or stratified Hazard Ratio, along with 95% CI from Cox proportional hazards (PH) model, adding Efron approximation for ties handling.
- P-value of log-rank test or stratified log-rank test.

Above are the most prevalent survival analysis methods in the Statistical Analysis Plan (SAP) for oncology trials. Let's see how to implement them in R.

In this blog, I will use the example data from the Worcester Heart Attack Study (https://stats.idre.ucla.edu/sas/seminars/sas-survival/) with 500 subjects, which has been wrapped in the `stabiot` R package. You can find the description of all columns in `?whas500` after installing the package, for example with `devtools::install_github("kaigu1990/stabiot")`.

```r
library(survival)
library(stabiot)

data("whas500")
```

If we want to compare the survival time between subjects with and without atrial fibrillation, we should first convert the `AFB` variable to a factor.

```r
library(dplyr)

dat <- whas500 %>%
  mutate(
    AFB = factor(AFB, levels = c(1, 0))
  )
```

Afterwards we can compute the Kaplan-Meier estimate of the survival function for the `whas500`

dataset.

`fit_km <- survfit(Surv(LENFOL, FSTAT) ~ AFB, data = dat, conf.type = "log-log")`

In the `Surv()` function, the event variable takes the value 1 for events and 0 for censoring, which differs from the convention in SAS. Setting `conf.type = "log-log"` tells the function to compute the CI of the median or other percentiles via the Brookmeyer and Crowley methodology using the log-log transformation; it has to be set explicitly because the default is `conf.type = "log"`.
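For readers who want to see what the Kaplan-Meier estimate actually computes, here is a minimal base R sketch of the product-limit formula, where the survival probability at each event time is the running product of `1 - n.event / n.risk`. The toy times and statuses are made up, and `km_by_hand()` is a hypothetical helper for illustration, not a replacement for `survfit()`.

```r
# Toy data (made up): follow-up time and status (1 = event, 0 = censored)
time <- c(1, 2, 2, 3, 4)
status <- c(1, 1, 0, 1, 0)

km_by_hand <- function(time, status) {
  event_times <- sort(unique(time[status == 1]))
  # Subjects still under observation just before each event time
  n_risk <- sapply(event_times, function(t) sum(time >= t))
  # Events occurring exactly at each event time
  n_event <- sapply(event_times, function(t) sum(time == t & status == 1))
  data.frame(
    time = event_times,
    n.risk = n_risk,
    n.event = n_event,
    surv = cumprod(1 - n_event / n_risk)
  )
}

km <- km_by_hand(time, status)
km
```

Censored subjects contribute to the risk set up to their censoring time but never reduce the survival estimate directly, which is why the subject censored at time 2 still counts as at risk at that time.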

Then we can use `summary()` to see more detail or to obtain the median survival time.

```r
print(summary(fit_km), digits = 4)

# median survival time with CI
summary(fit_km)$table
##       records n.max n.start events    rmean se(rmean)   median  0.95LCL  0.95UCL
## AFB=1      78    78      78     47 35.86989  3.821604 28.41889 13.76591 45.24025
## AFB=0     422   422     422    168 48.63073  1.714196 70.96509 51.77823       NA
```

Or use `quantile()` for any quantile estimate.

```r
# 25%, 50% and 75% survival time and CI
quantile(fit_km, probs = c(0.25, 0.5, 0.75))
## $quantile
##             25       50       75
## AFB=1  3.12115 28.41889 77.20739
## AFB=0 11.33470 70.96509 77.30595
## 
## $lower
##              25       50       75
## AFB=1 0.5585216 13.76591 50.85832
## AFB=0 6.1437372 51.77823 77.30595
## 
## $upper
##             25       50 75
## AFB=1 10.77618 45.24025 NA
## AFB=0 17.41273       NA NA
```

If you want to know the survival rate at specific time points such as 12, 36 and 60 months, use `times = c(12, 36, 60)` in the `summary()` function.

```r
summary(fit_km, times = c(12, 36, 60))
## Call: survfit(formula = Surv(LENFOL, FSTAT) ~ AFB, data = dat, conf.type = "log-log")
## 
##                 AFB=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    12     50      28    0.641  0.0543        0.524        0.736
##    36     27      12    0.455  0.0599        0.335        0.567
##    60     11       6    0.315  0.0643        0.195        0.441
## 
##                 AFB=0 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    12    312     110    0.739  0.0214        0.695        0.779
##    36    199      32    0.645  0.0244        0.595        0.690
##    60     77      21    0.530  0.0311        0.467        0.589
```

The `n.risk` column gives the number of subjects who are still at risk at each time point. The `n.event` column gives the number of events that occurred in the interval since the previous listed time point. The `survival` column gives the survival rate from the KM estimate, and the last two columns are the corresponding CI.
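As a cross-check of the CI columns, the log-log interval can be reproduced by hand from the survival rate and its standard error. The sketch below applies the usual delta-method formula on the `log(-log(S))` scale; `loglog_ci()` is a hypothetical helper written for this post, and the inputs are the rounded 12-month values for `AFB=1` from the output above, so small rounding differences are expected.

```r
loglog_ci <- function(surv, se, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  # Standard error on the log(-log(S)) scale via the delta method
  se_ll <- se / (surv * abs(log(surv)))
  c(lower = surv^exp(z * se_ll), upper = surv^exp(-z * se_ll))
}

# AFB=1 at 12 months, rounded values taken from the printed output
ci_12 <- loglog_ci(surv = 0.641, se = 0.0543)
round(ci_12, 3)
```

Working on the `log(-log(S))` scale guarantees that both limits stay within (0, 1), which a plain `S ± z * SE` interval does not.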

In addition, you may be interested in the difference in rates, and the corresponding CI, between the groups with and without AFB. Since you know the rate and SE for each group separately, the difference and its SE can be calculated directly. For the CI, use the `qnorm()` function as follows.

```r
# rate_tb is assumed here to hold the 12-month estimates for both groups
rate_tb <- summary(fit_km, times = 12, extend = TRUE)

diff_rate <- diff(rate_tb$surv)
diff_se <- sqrt(sum(rate_tb$std.err^2))
diff_rate + c(-1, 1) * qnorm(1 - 0.05 / 2) * diff_se
## [1] -0.01608841  0.21271012
```
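The same calculation can be wrapped into a small function that also returns the two-sided Z-test p-value. This is a sketch under the assumption that the two groups are independent, so the variances simply add; `surv_diff_ztest()` is a hypothetical helper, and the inputs are the rounded 12-month values from the `summary()` output, so the result differs slightly from the unrounded interval.

```r
surv_diff_ztest <- function(surv, se, conf.level = 0.95) {
  stopifnot(length(surv) == 2, length(se) == 2)
  d <- surv[2] - surv[1]   # difference: second group minus first group
  d_se <- sqrt(sum(se^2))  # SE of the difference for independent groups
  z <- qnorm(1 - (1 - conf.level) / 2)
  c(
    diff = d,
    lower = d - z * d_se,
    upper = d + z * d_se,
    p.value = 2 * (1 - pnorm(abs(d) / d_se))
  )
}

# 12-month rates for AFB=1 and AFB=0 (rounded values from the output)
res_diff <- surv_diff_ztest(surv = c(0.641, 0.739), se = c(0.0543, 0.0214))
round(res_diff, 4)
```

Since the interval contains zero, the Z-test does not reject equality of the 12-month rates at the 5% level, even though the overall curves may still differ.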

The log-rank test is a non-parametric test for comparing the survival function across two or more groups, where the null hypothesis is that the groups' survival functions are the same. It can be calculated via the `survminer::surv_pvalue()` function with `method = "log-rank"` for a `survfit` object, or via the `survival::survdiff()` function with `rho = 0`. Both settings are the defaults, so you don't need to specify them explicitly. The two approaches are shown below.

```r
survminer::surv_pvalue(fit_km, method = "log-rank")
##   variable         pval   method    pval.txt
## 1      AFB 0.0009616214 Log-rank p = 0.00096

survival::survdiff(Surv(LENFOL, FSTAT) ~ AFB, data = dat, rho = 0)$pvalue
## [1] 0.0009616214
```

If you would like to know more about the log-rank test, this article (Log Rank Test) is a useful reference.
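To make the mechanics less of a black box, here is a minimal base R sketch of the two-group log-rank statistic: at each event time it compares the observed number of events in one group with the number expected under the null hypothesis, using the hypergeometric variance. The function `logrank_by_hand()` and the toy data are hypothetical and only meant for illustration; in practice `survdiff()` should be used.

```r
logrank_by_hand <- function(time, status, group) {
  group <- as.factor(group)
  stopifnot(nlevels(group) == 2)
  g1 <- group == levels(group)[1]
  event_times <- sort(unique(time[status == 1]))
  o1 <- e1 <- v1 <- numeric(length(event_times))
  for (j in seq_along(event_times)) {
    t <- event_times[j]
    n <- sum(time >= t)                # at risk overall
    n1 <- sum(time >= t & g1)          # at risk in group 1
    d <- sum(time == t & status == 1)  # events overall
    o1[j] <- sum(time == t & status == 1 & g1) # observed events in group 1
    e1[j] <- d * n1 / n                        # expected events in group 1
    v1[j] <- if (n > 1) d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1) else 0
  }
  chisq <- (sum(o1) - sum(e1))^2 / sum(v1)
  c(chisq = chisq, p.value = 1 - pchisq(chisq, df = 1))
}

# Toy data (made up for illustration)
time <- c(2, 4, 6, 8, 10, 3, 5, 7, 9, 12)
status <- c(1, 1, 1, 1, 0, 1, 0, 1, 0, 0)
group <- rep(c("A", "B"), each = 5)

lr <- logrank_by_hand(time, status, group)
lr
```

The statistic is symmetric in the two groups, so it does not matter which group is treated as the first level.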

As we know, the Cox regression model is semi-parametric: like the non-parametric KM method, it makes no assumption about the distribution of the event times, but it relies on partial likelihood estimation, in which the covariate effects are specified parametrically. Before fitting the Cox model, we should make sure the proportional hazards assumption is met. More details can be seen at http://www.sthda.com/english/wiki/cox-model-assumptions.

Let's continue with the `whas500` example data. If you want to estimate the hazard ratio comparing the two groups and also specify the Efron approximation for tie handling, the `survival::coxph()` function can be used to fit the Cox PH model.

```r
fit_cox <- coxph(Surv(LENFOL, FSTAT) ~ AFB, data = dat, ties = "efron")
summary(fit_cox)
## Call:
## coxph(formula = Surv(LENFOL, FSTAT) ~ AFB, data = dat, ties = "efron")
## 
##   n= 500, number of events= 215 
## 
##         coef exp(coef) se(coef)      z Pr(>|z|)   
## AFB0 -0.5397    0.5829   0.1654 -3.263   0.0011 **
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
## 
##      exp(coef) exp(-coef) lower .95 upper .95
## AFB0    0.5829      1.716    0.4215    0.8061
## 
## Concordance= 0.537  (se = 0.014 )
## Likelihood ratio test= 9.58  on 1 df,   p=0.002
## Wald test            = 10.64  on 1 df,   p=0.001
## Score (logrank) test = 10.9  on 1 df,   p=0.001
```

The Cox results can be interpreted as follows:

- The `coef` is the coefficient, and `z` is the Wald statistic, i.e. the ratio of `coef` to its standard error (`se(coef)`).
- The hazard ratio (HR) corresponds to `exp(coef)` and compares the current level with the reference level. An HR of `0.58` indicates that, at any given time, subjects without AFB have about 0.58 times the hazard of those with AFB, i.e. a hazard that is roughly 42% lower. Equivalently, the `exp(-coef)` value of `1.716` means the hazard for subjects with AFB is about 1.7 times that of subjects without AFB.
- The HR confidence interval is also provided, with a lower 95% bound of 0.4215 and an upper 95% bound of 0.8061.
- There are three alternative tests for the overall significance of the model: the likelihood-ratio test, the Wald test, and the score (log-rank) test. The log-rank test is closely related to a univariate Cox regression (with treatment as the only covariate): it corresponds to the score test of that model, so the log-rank p-value from `survdiff()` can be reported alongside the Cox model as well.
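The relationship between these quantities can be verified with a few lines of base R. The sketch below starts from the rounded `coef` and `se(coef)` printed in the output and rebuilds the Wald statistic, the hazard ratio, its 95% CI and the p-value; the variable names are chosen for this example only.

```r
# Rounded values copied from the coxph() output
coef_afb0 <- -0.5397
se_afb0 <- 0.1654

z <- coef_afb0 / se_afb0  # Wald statistic
hr <- exp(coef_afb0)      # hazard ratio, no AFB vs AFB
ci <- exp(coef_afb0 + c(-1, 1) * qnorm(0.975) * se_afb0) # 95% CI on the HR scale
p <- 2 * pnorm(-abs(z))   # two-sided Wald p-value

round(c(HR = hr, lower = ci[1], upper = ci[2], z = z, p = p), 4)
```

Note that the CI is computed on the log-hazard scale and then exponentiated, which is why it is not symmetric around the HR.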

The stratified log-rank test is commonly used for randomized clinical trials when there are baseline factors that may be related to the treatment effect.

The stratified log-rank test is the log-rank test that accounts for the difference in prognostic factors between the two groups. Specifically, we divide the data according to the levels of the significant prognostic factors and form a stratum for each level. At each level, we arrange the survival times in ascending order and calculate the observed number of events, expected number of events, and variance at each survival time as we would in the regular log-rank test. (Chapter 23 - An Introduction to Survival Analysis)

To implement the stratified log-rank test, simply include `strata()` within the survival model formula as follows.

```r
strat_km <- survfit(Surv(LENFOL, FSTAT) ~ AFB + strata(AGE, GENDER),
  data = dat, conf.type = "log-log"
)
survminer::surv_pvalue(strat_km, method = "log-rank")
##                  variable       pval   method  pval.txt
## 1 AFB+strata(AGE, GENDER) 0.08269744 Log-rank p = 0.083
```

Regarding the stratified Cox model, it can be used if there are one or more predictors that don’t satisfy the proportional hazard assumptions. In other words, the proportional hazard is violated.

It can also be fitted using `coxph()` along with the `strata()` function, in the same way as the stratified log-rank test.

```r
strat_cox <- coxph(Surv(LENFOL, FSTAT) ~ AFB + strata(AGE, GENDER),
  data = dat, ties = "efron"
)
summary(strat_cox)

strat_cox %>%
  broom::tidy(exponentiate = TRUE, conf.int = TRUE, conf.level = 0.95) %>%
  select(term, estimate, conf.low, conf.high)
## # A tibble: 1 × 4
##   term  estimate conf.low conf.high
##   <chr>    <dbl>    <dbl>     <dbl>
## 1 AFB0     0.695    0.462      1.05
```

Above is a summary of common survival analyses in R. In the next step, I would like to wrap these functions into one or two functions and specify the print method so that we can simply use them to compare with SAS.

https://stats.stackexchange.com/questions/486806/the-logrank-test-statistic-is-equivalent-to-the-score-of-a-cox-regression-is-th

https://discourse.datamethods.org/t/when-is-log-rank-preferred-over-univariable-cox-regression/2344

Introduction to Regression Methods for Public Health Using R

Survival Analysis in R Companion

Survival Analysis Using R

Survival analysis in clinical trials — Log-rank test

Survival analysis in clinical trials — Kaplan-Meier estimator

Although there are lots of blogs that explain how to derive the best overall response (BOR) in SAS, only a few use R to do so. This article discusses how to derive BOR, with or without confirmation, in R.

Firstly, let's look at the programming logic for BOR without confirmation.

- Set to complete response (CR) if one CR exists.
- Set to partial response (PR) if one PR exists.
- Set to stable disease (SD) if one SD exists, which meets the minimum requirement for SD duration from treatment (or randomization) start to the date of the response.
- Set to progressive disease (PD) if one PD exists.
- Set to not estimable (NE) if only NE exists or the response cannot meet minimum SD duration criteria.

Afterwards, you can select the best response above for each subject as the BOR.
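The logic above can be sketched in a few lines of base R. This is a simplified illustration, not the `stabiot` implementation: the function `bor_unconfirmed()`, the toy data, the ranking CR > PR > SD > PD > NE, and the rule that an SD counts only when it is at least `ref_start_window` days after `TRTSDT` are assumptions made for this example.

```r
bor_unconfirmed <- function(data, ref_start_window = 28) {
  rank_of <- c(CR = 1, PR = 2, SD = 3, PD = 4, NE = 5)
  # An SD recorded before the minimum duration from treatment start counts as NE
  too_early_sd <- data$AVALC == "SD" &
    as.numeric(data$ADT - data$TRTSDT) < ref_start_window
  resp <- ifelse(too_early_sd, "NE", data$AVALC)
  # Best (lowest) rank per subject
  best_rank <- tapply(rank_of[resp], data$USUBJID, min)
  data.frame(
    USUBJID = names(best_rank),
    BOR = names(rank_of)[as.integer(best_rank)],
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy data for illustration only
adrs_toy <- data.frame(
  USUBJID = c("1", "1", "2", "3", "3"),
  TRTSDT = as.Date(rep("2020-01-01", 5)),
  ADT = as.Date(c("2020-01-20", "2020-02-20", "2020-01-10", "2020-02-05", "2020-03-05")),
  AVALC = c("PR", "CR", "SD", "SD", "PD"),
  stringsAsFactors = FALSE
)

bor_toy <- bor_unconfirmed(adrs_toy)
bor_toy
```

Subject 2 has an SD only 9 days after treatment start, so it is downgraded to NE, while subject 3 keeps SD because it was recorded 35 days after treatment start and ranks above the later PD.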

Below are the rules to evaluate the best response when confirmation of CR and PR is required for the BOR derivation, from the RECIST guideline (https://ctep.cancer.gov/protocolDevelopment/docs/recist_guideline.pdf).

Actually, we don't consider the scenario where the CR is followed by PR, but we must consider the scenario where subsequent response is not sequential. And we also need to consider how many NEs are acceptable between response and confirmatory response.

Thus the programming logic for confirmed BOR can be summarized as follows:

- Set to complete response (CR) if there is one confirmatory CR at least a minimum number of days (e.g., 28 days) later, all responses between the two should only be "CR" or "NE", and there are no more than a maximum NE (e.g., one NE) between two responses.
- Set to partial response (PR) if there is one confirmatory CR or PR at least a minimum number of days (e.g., 28 days) later, all responses between the two are only "CR", "PR" or "NE", and there are no more than a maximum number of NEs (e.g., one NE) between the two PR/CR responses.
- Set to stable disease (SD) if there is one CR, PR or SD that meets the minimum requirement for the duration from treatment (or randomization) start to the date of that response.
- Set to progressive disease (PD) if one PD exists.
- Set to not estimable (NE) if there is at least one CR, PR, SD, NE.

And then like the unconfirmed BOR, you can select the best response above for each subject as the confirmed BOR.
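To make the confirmation rule for CR more concrete, here is a minimal base R sketch for a single subject. It is an illustration under stated assumptions rather than the code used in `stabiot`: the function name `has_confirmed_cr()` and its arguments `min_days` and `max_ne` are hypothetical names chosen for this example.

```r
# Hypothetical helper: is there a CR confirmed by a later CR?
has_confirmed_cr <- function(adt, avalc, min_days = 28, max_ne = 1) {
  ord <- order(adt)
  adt <- adt[ord]
  avalc <- avalc[ord]
  cr_idx <- which(avalc == "CR")
  for (i in cr_idx) {
    for (j in cr_idx[cr_idx > i]) {
      # The confirmatory CR must be at least `min_days` later
      if (as.numeric(adt[j] - adt[i]) < min_days) next
      between <- if (j - i > 1) avalc[(i + 1):(j - 1)] else character(0)
      # Only CR or NE may occur in between, with at most `max_ne` NEs
      if (all(between %in% c("CR", "NE")) && sum(between == "NE") <= max_ne) {
        return(TRUE)
      }
    }
  }
  FALSE
}

# One subject with PR, CR, NE, CR, SD
adt <- as.Date(c("2020-01-01", "2020-02-01", "2020-02-16", "2020-03-01", "2020-04-01"))
avalc <- c("PR", "CR", "NE", "CR", "SD")

confirmed_one_ne <- has_confirmed_cr(adt, avalc, max_ne = 1)
confirmed_no_ne <- has_confirmed_cr(adt, avalc, max_ne = 0)
```

With one NE allowed, the CR on 2020-02-01 is confirmed by the CR on 2020-03-01 (29 days later); with no NE allowed, it is not.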

I have created a function in the `stabiot` R package that follows the rules discussed above. For example, let's try the `derive_bor()` function as shown below. More details can be found in `?derive_bor`.

```r
# This example is adapted from `admiral::event_joined`.
library(dplyr)   # provides %>%
library(stabiot) # provides derive_bor()

adrs <- tibble::tribble(
  ~USUBJID, ~TRTSDTC,     ~ADTC,        ~AVALC,
  "1",      "2020-01-01", "2020-01-01", "PR",
  "1",      "2020-01-01", "2020-02-01", "CR",
  "1",      "2020-01-01", "2020-02-16", "NE",
  "1",      "2020-01-01", "2020-03-01", "CR",
  "1",      "2020-01-01", "2020-04-01", "SD",
  "2",      "2019-12-12", "2020-01-01", "SD",
  "2",      "2019-12-12", "2020-02-01", "PR",
  "2",      "2019-12-12", "2020-03-01", "SD",
  "2",      "2019-12-12", "2020-03-13", "CR",
  "4",      "2019-12-30", "2020-01-01", "PR",
  "4",      "2019-12-30", "2020-03-01", "NE",
  "4",      "2019-12-30", "2020-04-01", "NE",
  "4",      "2019-12-30", "2020-05-01", "PR",
  "5",      "2020-01-01", "2020-01-01", "PR",
  "5",      "2020-01-01", "2020-01-10", "PR",
  "5",      "2020-01-01", "2020-01-20", "PR",
  "6",      "2020-02-02", "2020-02-06", "PR",
  "6",      "2020-02-02", "2020-02-16", "CR",
  "6",      "2020-02-02", "2020-03-30", "PR",
  "7",      "2020-02-02", "2020-02-06", "PR",
  "7",      "2020-02-02", "2020-02-16", "CR",
  "7",      "2020-02-02", "2020-04-01", "NE",
  "8",      "2020-02-01", "2020-02-16", "PD"
) %>%
  dplyr::mutate(
    ADT = lubridate::ymd(ADTC),
    TRTSDT = lubridate::ymd(TRTSDTC),
    PARAMCD = "OVR",
    PARAM = "Overall Response by Investigator"
  ) %>%
  dplyr::select(-TRTSDTC)
```

Suppose that we want to calculate the BOR without confirmation and the minimum SD duration is set to 4 weeks; we simply need to specify `ref_start_window = 28`.

```r
derive_bor(data = adrs, ref_start_window = 28)
## # A tibble: 7 × 8
##   USUBJID ADTC       AVALC ADT        TRTSDT     PARAMCD PARAM                  AVAL
##   <chr>   <chr>      <chr> <date>     <date>     <chr>   <chr>                 <dbl>
## 1 1       2020-02-01 CR    2020-02-01 2020-01-01 BOR     Best Overall Response     1
## 2 2       2020-03-13 CR    2020-03-13 2019-12-12 BOR     Best Overall Response     1
## 3 4       2020-01-01 PR    2020-01-01 2019-12-30 BOR     Best Overall Response     2
## 4 5       2020-01-01 PR    2020-01-01 2020-01-01 BOR     Best Overall Response     2
## 5 6       2020-02-16 CR    2020-02-16 2020-02-02 BOR     Best Overall Response     1
## 6 7       2020-02-16 CR    2020-02-16 2020-02-02 BOR     Best Overall Response     1
## 7 8       2020-02-16 PD    2020-02-16 2020-02-01 BOR     Best Overall Response     4
```

Suppose that we want to calculate the BOR with confirmation, with the minimum SD duration set to 4 weeks and the interval between the two responses set to 28 days; we simply need to add `ref_interval = 28` and `confirm = TRUE`.

```r
derive_bor(data = adrs, ref_start_window = 28, ref_interval = 28, confirm = TRUE)
## # A tibble: 7 × 8
##   USUBJID ADTC       AVALC ADT        TRTSDT     PARAMCD PARAM                            AVAL
##   <chr>   <chr>      <chr> <date>     <date>     <chr>   <chr>                           <dbl>
## 1 1       2020-02-01 CR    2020-02-01 2020-01-01 CBOR    Confirmed Best Overall Response     1
## 2 2       2020-02-01 SD    2020-02-01 2019-12-12 CBOR    Confirmed Best Overall Response     3
## 3 4       2020-05-01 SD    2020-05-01 2019-12-30 CBOR    Confirmed Best Overall Response     3
## 4 5       2020-01-01 NE    2020-01-01 2020-01-01 CBOR    Confirmed Best Overall Response     5
## 5 6       2020-02-06 PR    2020-02-06 2020-02-02 CBOR    Confirmed Best Overall Response     2
## 6 7       2020-02-06 NE    2020-02-06 2020-02-02 CBOR    Confirmed Best Overall Response     5
## 7 8       2020-02-16 PD    2020-02-16 2020-02-01 CBOR    Confirmed Best Overall Response     4
```

If, in addition to the above conditions, we don't want any NE between the response and the confirmatory response, we can simply add `max_ne = 0`.

```r
derive_bor(data = adrs, ref_start_window = 28, ref_interval = 28, confirm = TRUE, max_ne = 0)
## # A tibble: 7 × 8
##   USUBJID ADTC       AVALC ADT        TRTSDT     PARAMCD PARAM                            AVAL
##   <chr>   <chr>      <chr> <date>     <date>     <chr>   <chr>                           <dbl>
## 1 1       2020-01-01 PR    2020-01-01 2020-01-01 CBOR    Confirmed Best Overall Response     2
## 2 2       2020-02-01 SD    2020-02-01 2019-12-12 CBOR    Confirmed Best Overall Response     3
## 3 4       2020-05-01 SD    2020-05-01 2019-12-30 CBOR    Confirmed Best Overall Response     3
## 4 5       2020-01-01 NE    2020-01-01 2020-01-01 CBOR    Confirmed Best Overall Response     5
## 5 6       2020-02-06 PR    2020-02-06 2020-02-02 CBOR    Confirmed Best Overall Response     2
## 6 7       2020-02-06 NE    2020-02-06 2020-02-02 CBOR    Confirmed Best Overall Response     5
## 7 8       2020-02-16 PD    2020-02-16 2020-02-01 CBOR    Confirmed Best Overall Response     4
```

The above is my summary of the BOR calculation. If you find any problem or error, please email me or open an issue at https://github.com/kaigu1990/stabiot/issues.

Last but not least, I'd like to thank the `admiral` R package; I learned a lot about programming the BOR calculation from `admiral::derive_extreme_event()`.

https://github.com/pharmaverse/admiral

https://www.pharmasug.org/proceedings/2023/QT/PharmaSUG-2023-QT-047.pdf

https://www.pharmasug.org/proceedings/2020/DV/PharmaSUG-2020-DV-066.pdf

https://ctep.cancer.gov/protocolDevelopment/docs/recist_guideline.pdf

https://www.lexjansen.com/pharmasug-cn/2021/SR/Pharmasug-China-2021-SR038.pdf

The advantage of the ORR is that it can be assessed earlier than PFS/OS, and in smaller samples. In general, we will assume that the response rate follows the binomial distribution, so naturally we will consider the ORR as a binomial response rate, and the Clopper-Pearson method is frequently used to estimate the two-sided 95% confidence interval (CI). If you would like to control the confounding factors in the stratified study design, the Cochran-Mantel-Haenszel (CMH) test provides a solution to address these needs.

How about the odds ratio (OR)? It is a measure of the association between an exposure and an outcome: the odds of the outcome occurring under a particular exposure compared with the odds in the absence of that exposure. Thus, we can use it to compare the ORR between the treatment and control groups in RCTs, alongside the 95% CI of the binomial response rate, as commonly presented in reports. More details can be found in Explaining Odds Ratios.
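Written as a formula, and assuming we denote the response rate in the treatment group by $p_1$ and in the control group by $p_0$ (these symbols are introduced here only for illustration), the odds ratio is:

$$
\mathrm{OR} = \frac{p_1 / (1 - p_1)}{p_0 / (1 - p_0)}
$$

An OR of 1 means the odds of response are the same in both groups, while an OR above 1 means the odds of response are higher in the treatment group.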

Firstly, let's see how to use `proc freq` in SAS to obtain the ORR with the Clopper-Pearson (exact) CI and the odds ratio with and without stratification. Imagine we have example data with columns TRTPN (1/2), ORR (1 for subjects with a response and 2 for subjects without, as coded in the data step), Strata1 (A/B) and the count.

```sas
data dat;
  input TRTPN ORR Strata1 $ Count @@;
  datalines;
1 1 A 8  1 1 B 12
1 2 A 17 1 2 B 13
2 1 A 13 2 1 B 9
2 2 A 20 2 2 B 8
;
run;
```

Then use the `tables` statement with `binomial` to compute the CI of the ORR. The `level="1"` binomial option computes the proportion for subjects with events, which means the CI corresponds to the ORR event. And the `exact binomial` statement computes the Clopper-Pearson CI.

`ods listing close;proc freq data=dat; by trtpn; weight count/zeros; tables orr/binomial(level="1") alpha=0.05; exact binomial; ods output binomial=orrci;run;ods listing;`

Before the stratification analysis, let's see the common odds ratio without any stratified factors. The option `chisq`

requests chi-square tests and measurements, and `relrisk`

displays the odds ratio and relative risk with asymptotic Wald CI by default.

`ods listing close;proc freq data=dat; weight count/zeros; tables TRTPN*ORR /chisq relrisk; ods output FishersExact=pval RelativeRisks=ci;run;ods listing;`

And then let's see how to use `CMH`

as the statistical method in `proc freq`

to obtain the association statistics, p-value of Cochran-Mantel-Haenszel test, adjusted odds ratio by Strata1 variable and corresponding CI.

`ods listing close;proc freq data=dat; weight count/zeros; tables Strata1*TRTPN*ORR /cmh; ods output cmh=cmhpval CommonRelRisks=cmhci; run;`

Now that we have seen the example of the `proc freq`

used to compute the odds ratio with and without stratification, let's have a look at how to use the logistic regression `proc logistic`

to do it.

`proc logistic data=dat; weight count; class TRTPN / param=ref ref=last; model ORR(event='1')=TRTPN;run;`

And the stratification analysis by logistic as shown below.

`proc logistic data=dat; freq count; class TRTPN Strata1 / param=ref ref=last; strata Strata1; model ORR(event='1')=TRTPN;run;`

However, we can see there is a little difference between `proc freq`

and the logistic regression method of odds ratio. The same condition occurs in R as well.

Now let's jump into the R section, how can we handle the same analysis in R?

First of all, I want to recommend the `tern` R package, which focuses on clinical statistical analysis and provides several helpful functions. More details can be found in the tern package documentation.

I create an example data set similar to the one shown above, which includes the same columns but is not the counted table. The columns of strata1 - strata3 represent three stratified factors.

`set.seed(12)dta <- data.frame( orr = sample(c(1, 0), 100, TRUE), trtpn = factor(rep(c(1, 2), each = 50), levels = c(2, 1)), strata1 = factor(sample(c("A", "B"), 100, TRUE)), strata2 = factor(sample(c("C", "D"), 100, TRUE)), strata3 = factor(sample(c("E", "F"), 100, TRUE)))`

Then you can use `BinomCI`

function to compute the CI of ORR and `BinomDiffCI`

function to compute the CI of difference ORR in two treatments.

`dta %>% count(trtpn, orr)## trtpn orr n## 1 2 0 28## 2 2 1 22## 3 1 0 30## 4 1 1 20 DescTools::BinomCI(x = 20, n = 50, method = "clopper-pearson")## est lwr.ci upr.ci## [1,] 0.4 0.2640784 0.548206DescTools::BinomDiffCI(20, 50, 22, 50, method=c("wald"))## est lwr.ci upr.ci## [1,] -0.04 -0.2333125 0.1533125`
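As a cross-check that needs no extra packages, the two intervals above can be reproduced in base R. This is a sketch assuming the standard textbook formulas: the Clopper-Pearson limits as beta quantiles (which is also what `binom.test()` reports) and the Wald interval for the difference of two independent proportions. The variable names are made up for this example.

```r
x <- 20
n <- 50
alpha <- 0.05

# Clopper-Pearson limits expressed as beta quantiles
cp <- c(
  qbeta(alpha / 2, x, n - x + 1),
  qbeta(1 - alpha / 2, x + 1, n - x)
)
# binom.test() returns the same exact interval
bt <- as.numeric(binom.test(x, n, conf.level = 1 - alpha)$conf.int)

# Wald interval for the difference of two proportions (20/50 vs 22/50)
p1 <- 20 / 50
p2 <- 22 / 50
se_diff <- sqrt(p1 * (1 - p1) / 50 + p2 * (1 - p2) / 50)
wald_diff <- (p1 - p2) + c(-1, 1) * qnorm(1 - alpha / 2) * se_diff

round(cp, 4)
round(wald_diff, 4)
```

Both results agree with the `DescTools` output shown above.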

Regarding the unstratified analysis of the odds ratio, we can use the `DescTools::OddsRatio()` function, or logistic regression using `glm()` with the `logit` link. Below is the code to get the odds ratio and the corresponding Wald CI using the `OddsRatio()` function.

`DescTools::OddsRatio(matrix(c(20, 22, 30, 28), nrow = 2, byrow = TRUE), method = "wald", conf.level = 0.95)## odds ratio lwr.ci upr.ci ## 0.8484848 0.3831831 1.8788054 `
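For readers who want to see what the Wald method does, here is a hand calculation in base R. It is a sketch assuming the usual formula, where the standard error of the log odds ratio is the square root of the sum of the reciprocal cell counts; the variable names are chosen for this example only.

```r
# 2x2 cell counts: responders and non-responders for TRTPN 1 and 2
a <- 20  # TRTPN 1, ORR = 1
b <- 22  # TRTPN 2, ORR = 1
c0 <- 30 # TRTPN 1, ORR = 0
d <- 28  # TRTPN 2, ORR = 0

or_hat <- (a * d) / (b * c0)
se_log_or <- sqrt(1 / a + 1 / b + 1 / c0 + 1 / d)
wald_ci <- exp(log(or_hat) + c(-1, 1) * qnorm(0.975) * se_log_or)

round(c(or_hat, wald_ci), 4)
```

The estimate and interval match the `OddsRatio()` output above.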

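The `glm()` function gives the same odds ratio, as shown below. The CI differs slightly because `confint()` uses the profile likelihood rather than the Wald approximation.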

`fit <- glm(orr ~ trtpn, data = dta, family = binomial(link = "logit"))exp(cbind(Odds_Ratio = coef(fit), confint(fit)))## Odds_Ratio 2.5 % 97.5 %## (Intercept) 0.7857143 0.4450719 1.369724## trtpn1 0.8484848 0.3811997 1.879735`

Regarding the stratified analysis of the odds ratio, I have found two ways of computing it. One is the Cochran-Mantel-Haenszel chi-squared test using the `mantelhaen.test()` function, and the other is conditional logistic regression using the `survival::clogit()` function with `strata` for the stratification. Let's have a look at the specific steps.

Assuming that we want to consider three stratification factors in our CMH test, we had better pre-process the data properly before passing it to the `mantelhaen.test` function, because this function requires a 3-dimensional array (2 x 2 x number of strata) as input.

```r
# pre-process
df <- dta %>% count(trtpn, orr, strata1, strata2, strata3)
tab <- xtabs(n ~ trtpn + orr + strata1 + strata2 + strata3, data = df)
tb <- as.table(array(c(tab), dim = c(2, 2, 2 * 2 * 2)))
# CMH analysis
mantelhaen.test(tb, correct = FALSE)
## 
##  Mantel-Haenszel chi-squared test without continuity correction
## 
## data:  tb
## Mantel-Haenszel X-squared = 0.40574, df = 1, p-value = 0.5241
## alternative hypothesis: true common odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.3376522 1.7320849
## sample estimates:
## common odds ratio 
##         0.7647498
```

PS. If we use only one stratification factor, such as `strata1`, we get the same result as SAS `proc freq`. You can also use `vcdExtra::CMHtest` to compute the CMH p-value, but if you want to obtain the same p-value as SAS, a modification has to be made to the vcdExtra library. Refer to this GitHub issue: https://github.com/friendly/vcdExtra/issues/3.
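To show what the common odds ratio reported by the CMH analysis is, the sketch below computes the Mantel-Haenszel estimator by hand and compares it with `mantelhaen.test()`. It reuses the counts from the SAS data step earlier in this post (one stratification factor, `Strata1`); the object names are assumptions made for this example.

```r
# 2 x 2 x 2 array: rows = TRTPN (1/2), columns = ORR (1/2), layers = Strata1 (A/B)
tab <- array(
  c(8, 13, 17, 20,  # stratum A, filled column by column
    12, 9, 13, 8),  # stratum B
  dim = c(2, 2, 2),
  dimnames = list(TRTPN = c("1", "2"), ORR = c("1", "2"), Strata1 = c("A", "B"))
)

# Mantel-Haenszel estimator: sum(a * d / n) / sum(b * c / n) over strata
n_k <- apply(tab, 3, sum)
mh_or <- sum(tab[1, 1, ] * tab[2, 2, ] / n_k) /
  sum(tab[1, 2, ] * tab[2, 1, ] / n_k)

cmh <- mantelhaen.test(tab, correct = FALSE)
c(by_hand = mh_or, mantelhaen = unname(cmh$estimate))
```

The hand calculation and the `estimate` element of the test object should agree, which makes it easier to see why the CMH odds ratio differs from a model-based one.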

And how do we implement it using conditional logistic regression? Just add the `strata` term to the formula.

`library(survival)fit <- clogit(formula = orr ~ trtpn + strata(strata1, strata2, strata3), data = dta)exp(cbind(Odds_Ratio = coef(fit), confint(fit)))## Odds_Ratio 2.5 % 97.5 %## trtpn1 0.7592608 0.335024 1.720704`

To sum up, this is my brief summary of the statistical analysis of ORR and odds ratio in R and SAS. CMH is a widely used method to test the association between treatment and a binary outcome when you want to account for stratification factors. Lastly, one question remains unanswered: why do we obtain different results from the logistic regression compared to the CMH test when we use them to compute the stratified odds ratio? I'm still looking for the answer.

Introduction to tern

Calculation of Cochran–Mantel–Haenszel Statistics for Objective Response and Clinical Benefit Rates and the Effects of Stratification Factors

Estimating Binomial Proportion Confidence Interval with Zero Frequency Response using FREQ Procedure

The path less trodden - PROC FREQ for ODDS RATIO

R: How to Calculate Odds Ratios in Logistic Regression Model

`CAMIS`

github asking how to do the hypothesis testing of MMRM in R, especially in non-inferiority or superiority trials. And then I received a reminder that I can get the manual from `mmrm`

package document. The `mmrm`

package has provided the `df_1d()`

function to do the one-dimensional contrast. So let's start by fitting a mmrm model first with `us`

(unstructured) covariance structure and Kenward-Roger adjustment methods. I also include a linear Kenward-Roger approximation for coefficient covariance matrix adjustment so that R results can be compared with SAS when the unstructured covariance model is selected.

`library(mmrm)fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), reml = TRUE, method = "Kenward-Roger", vcov = "Kenward-Roger-Linear", data = fev_data)summary(fit)`

Assuming that we aim to compare Race white with Race Asian, the results are as follows.

```r
contrast <- numeric(length(component(fit, "beta_est")))
contrast[3] <- 1
df_1d(fit, contrast)
# same as
# emmeans(fit, ~ RACE) %>% contrast() %>% test()
```

Honestly, I prefer to use the `emmeans`

package to compute estimated marginal means (least-square means), especially when you also want to compute it by visit and by treatment. Because `mmrm`

package sets an object interface so that it can be used for the `emmeans`

package. And `emmeans`

has also built a set of useful functions to deal with common questions. So it’s a good solution to fit the MMRM model by `mmrm`

and do hypothesis testing by `emmeans`

.

A general assumption is that we would like to compute the least-square means first for the coefficients of the MMRM by visit and by treatment. This can be done through `emmeans()`

and `confint()`

functions.

`library(emmeans)ems <- emmeans(fit, ~ ARMCD | AVISIT)confint(ems)## AVISIT = VIS1:## ARMCD emmean SE df lower.CL upper.CL## PBO 33.3 0.761 148 31.8 34.8## TRT 37.1 0.767 143 35.6 38.6## ## AVISIT = VIS2:## ARMCD emmean SE df lower.CL upper.CL## PBO 38.2 0.616 147 37.0 39.4## TRT 41.9 0.605 143 40.7 43.1## ## AVISIT = VIS3:## ARMCD emmean SE df lower.CL upper.CL## PBO 43.7 0.465 130 42.8 44.6## TRT 46.8 0.513 130 45.7 47.8## ## AVISIT = VIS4:## ARMCD emmean SE df lower.CL upper.CL## PBO 48.4 1.199 134 46.0 50.8## TRT 52.8 1.196 133 50.4 55.1## ## Results are averaged over the levels of: RACE, SEX ## Confidence level used: 0.95 `

Naturally we will also want to consider the contrast to see what is the difference between treatment and placebo where the null hypothesis is that treatment minus placebo equals zero. Here the `contrast()`

function will be run. If you want to see the confidence interval of difference, just use `confint(contr)`

that will be fine. PS. You can relevel the order of `ARMCD`

factor in advance, in that case the `method=pairwise`

can reach the same results as well.

`contr <- contrast(ems, adjust = "none", method = "revpairwise")contr## AVISIT = VIS1:## contrast estimate SE df t.ratio p.value## TRT - PBO 3.77 1.082 146 3.489 0.0006## ## AVISIT = VIS2:## contrast estimate SE df t.ratio p.value## TRT - PBO 3.73 0.863 145 4.323 <.0001## ## AVISIT = VIS3:## contrast estimate SE df t.ratio p.value## TRT - PBO 3.08 0.696 131 4.429 <.0001## ## AVISIT = VIS4:## contrast estimate SE df t.ratio p.value## TRT - PBO 4.40 1.693 133 2.597 0.0104## ## Results are averaged over the levels of: RACE, SEX`

Besides maybe we would like to further assess whether treatment is superior to placebo with a margin of `2`

. You can utilize the `test()`

function with the `null = 2`

argument.

`test(contr, null = 2, side = ">")## AVISIT = VIS1:## contrast estimate SE df null t.ratio p.value## TRT - PBO 3.77 1.082 146 2 1.640 0.0516## ## AVISIT = VIS2:## contrast estimate SE df null t.ratio p.value## TRT - PBO 3.73 0.863 145 2 2.007 0.0233## ## AVISIT = VIS3:## contrast estimate SE df null t.ratio p.value## TRT - PBO 3.08 0.696 131 2 1.554 0.0613## ## AVISIT = VIS4:## contrast estimate SE df null t.ratio p.value## TRT - PBO 4.40 1.693 133 2 1.416 0.0795## ## Results are averaged over the levels of: RACE, SEX ## P values are right-tailed`

In general the common estimations and hypothesis testing of MMRM are all here, which at least I have encountered. In the next step, I want to compare the above results with SAS to see if it can be regarded as additional QC validation. We use the `lsmeans`

statement to estimate least-square means and do superiority testing at visit 4 through `lsmestimate`

statement.

`proc mixed data=fev_data; class ARMCD(ref='PBO') AVISIT RACE SEX USUBJID; model FEV1 = RACE SEX ARMCD ARMCD*AVISIT / ddfm=KR; repeated AVISIT / subject=USUBJID type=UN r rcorr; lsmeans ARMCD*AVISIT / cl alpha=0.05 diff slice=AVISIT; lsmeans ARMCD / cl alpha=0.05 diff; lsmestimate ARMCD*AVISIT [1,1 4] [-1,2 4] / cl upper alpha=0.025 testvalue=2; ods output lsmeans=lsm diffs=diff LSMEstimates=est;run;`

All of the results can be shown below, and they are consistent with the R results.

`emmeans`

package that will calculate the estimated mean value for different factor variables and assume the mean value for continuous variables. In addition, `emmeans`

also contains a set of functions not limited for contrasts and hypothesis testing that are commonly used in clinical trial statistical analysis, such as ANCOVA and MMRM. So the goal of this article is not only to know how to use the `emmeans`

package to answer these questions but also to learn several separate steps for each of them.

Let's start by fitting a model first. Given that I have an example of data `fev_data`

from `mmrm`

package, and then fit a simple ANCOVA model by `lm`

function.

`library(tidyverse)library(tidymodels)library(mmrm)library(emmeans)fit <- fev_data %>% filter(AVISIT == "VIS4" & !is.na(FEV1)) %>% lm(formula = FEV1 ~ ARMCD)tidy(fit)## # A tibble: 2 × 5## term estimate std.error statistic p.value## <chr> <dbl> <dbl> <dbl> <dbl>## 1 (Intercept) 47.8 1.23 38.8 4.73e-74## 2 ARMCDTRT 4.83 1.74 2.77 6.33e- 3`

I already know how to compute the LS means, but here we can learn how the `SE` is calculated. More details can be found at https://bookdown.org/dereksonderegger/571/4-contrasts.html.

So we can calculate the `LS mean`

estimate, `SE`

and corresponding confidence interval as shown below. Let's try to focus on the `TRT`

group.

```r
X <- model.matrix(fit)
sigma.hat <- glance(fit) %>% pull(sigma)
beta.hat <- tidy(fit) %>% pull(estimate)
XtX.inv <- solve(t(X) %*% X)
# contrast for TRT ARM
cont <- c(1, 1)
est <- t(cont) %*% beta.hat
std.err <- sigma.hat * sqrt(t(cont) %*% XtX.inv %*% cont)
df <- glance(fit) %>% pull(df.residual)
q <- qt(1 - 0.05 / 2, df)
ci <- c(est) + c(-1, 1) * q * c(std.err)
setNames(c(est, std.err, df, ci), c("est", "SE", "df", "lower.ci", "upper.ci"))
##        est         SE         df   lower.ci   upper.ci 
##  52.592798   1.230847 132.000000  50.158062  55.027535
```

The same results can be obtained by calling the `emmeans()`

function.

`ems <- emmeans(fit, ~ARMCD)ems## ARMCD emmean SE df lower.CL upper.CL## PBO 47.8 1.23 132 45.3 50.2## TRT 52.6 1.23 132 50.2 55.0`

Next, we will create a contrast that is a linear combination of the means. In the following example, the contrast answers the question of whether the treatment (`TRT`) produces a significantly different effect from placebo, i.e. `contrast = TRT - PBO`.

```r
# TRT vs. PBO: TRT - PBO
k <- c(-1, 1)
```

And then compute the estimate and standard error for the contrast. The basic principle behind the `SE` is that the variance of a linear combination of independent estimates equals the sum of their variances weighted by the squared coefficients.

```r
est <- tidy(ems) %>% pull(estimate)
se <- tidy(ems) %>% pull(std.error)
# `k` is the contrast vector c(-1, 1) defined above
con_est <- k %*% est
con_se <- sqrt(k^2 %*% se^2)
```
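Written out for this two-arm contrast, the standard error calculation is as follows, assuming the two arm means are independent, as they are in this one-factor model. The symbols $k_i$ and $\hat{\mu}_i$ are notation introduced here for the contrast coefficients and the LS means:

$$
\operatorname{Var}\Big(\sum_i k_i \hat{\mu}_i\Big) = \sum_i k_i^2 \operatorname{Var}(\hat{\mu}_i),
\qquad
\mathrm{SE}(\hat{\mu}_{TRT} - \hat{\mu}_{PBO}) = \sqrt{(1)^2\,\mathrm{SE}_{TRT}^2 + (-1)^2\,\mathrm{SE}_{PBO}^2}
$$

With both standard errors at about 1.23, as reported by `emmeans()` above, this gives roughly $\sqrt{2} \times 1.23 \approx 1.74$.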

Since we have the estimate (`con_est`) and standard error (`con_se`) above, we can use them to compute the confidence interval and p-value from the t distribution, assuming the null hypothesis is `TRT - PBO = 0`.

`df <- tidy(ems) %>% pull(df) %>% unique()t <- con_est[1, 1] / con_se[1, 1]q <- qt(1 - 0.05 / 2, df)ci <- con_est[1, 1] + c(-1, 1) * q * c(con_se)pval <- 2 * pt(t, df, lower.tail = FALSE)tibble( est = con_est[1,1], se = con_se[1,1], df = df, t = t, lower.ci = ci[1], upper.ci = ci[2], pval = pval)## # A tibble: 1 × 7## est se df t lower.ci upper.ci pval## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>## 1 4.83 1.74 132 2.77 1.39 8.27 0.00633`

We can also obtain the same results from the below code straightforwardly.

`contr <- contrast(ems, method = list(k), adjust = "none")contr## contrast estimate SE df t.ratio p.value## c(-1, 1) 4.83 1.74 132 2.774 0.0063confint(contr)## contrast estimate SE df lower.CL upper.CL## c(-1, 1) 4.83 1.74 132 1.39 8.27## Confidence level used: 0.95 `

Suppose that you want to do the hypothesis test that treatment is superior to placebo with a margin of `2`

, just add a small change to the t statistic.

`t2 <- (con_est[1, 1] - 2) / con_se[1, 1]pval <- pt(t2, df, lower.tail = FALSE)pval## [1] 0.05322456`

The same process can be implemented by the `emmeans::test()`

function.

`test(contr, null = 2, side = ">", adjust = "none")## contrast estimate SE df null t.ratio p.value## c(-1, 1) 4.83 1.74 132 2 1.625 0.0532## P values are right-tailed `

The above is only a simple example of a 1-way ANOVA, so that I can learn and understand it clearly. Actually, other complicated models and contrasts can be processed as well. Through these step by step computations, we can gain deeper thoughts of why we chose the functions, and what's the nature of our computation.

Chapter 4 Contrasts

Chapter 9 Contrasts and multiple comparison testing

Chapter 6 Beginning to Explore the emmeans package for post hoc tests and contrasts

My notes on using {emmeans}

Confidence intervals and tests in emmeans

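Since I migrated by means of a custom image, the process is very simple. Before switching servers, you only need to make the following preparations:

- Create a custom image of the current server
- Back up the original Hexo site files to prevent data loss
- Choose and configure the new ECS server: select the same region as the old ECS and the custom image you created, so that the subsequent migration goes smoothly

Once you have purchased the new ECS server, you can start switching servers by following these steps:

- The new ECS already has the same configuration as the old one, so there is almost no need to reconfigure Hexo, apart from replacing the old IP with the new IP in the `_config.yml` file
- Check that the required ports are open and that the firewall is configured
- Redeploy the blog posts
- Finally, update the DNS records for the domain, replacing the old IP with the new one; otherwise the site can only be reached by IP and not through the www domain

With the steps above, the Hexo migration to the new ECS server is complete.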

As we know, the common primary outcome in randomized trials is often the difference in average (LS mean) at a given timepoint (visit). One way to analyze these data is to ignore the measurements at intermediate timepoints and focus on estimating the outcome at the specific timepoint by ANCOVA, but the data should be complete. If not, the multiple imputation method is sometimes suggested. However, in the MMRM model, it's generally thought that utilizing the information from all timepoints implicitly handles missing data. In SAS, it's more efficient to use `proc mixed` than `proc glm` to handle missing values, as it allows the inclusion of subjects with missing data. And in R, I feel like the `mmrm` package is more powerful and runs more smoothly than others.

Here, I take the example data from the `mmrm` package and implement the MMRM using SAS and R, respectively. In this randomized trial, subjects are treated with a treatment drug or placebo, and the FEV1 (forced expiratory volume in one second) is a measure of how quickly the lungs can be emptied. This measure is repeated from Visit 1 to Visit 4. Low levels of FEV1 may indicate chronic obstructive pulmonary disease (COPD).

```r
library(mmrm)
data("fev_data")
# Export the example data so that it can be imported into SAS
write.csv(fev_data, file = "./fev_data.csv", na = "", row.names = FALSE)
```

To evaluate the effect of treatment on FEV1, the endpoint measurements can be analyzed using MMRM with an unstructured covariance matrix reflecting the correlation between visits within subjects; treatment (treatment drug or placebo), visit and treatment-by-visit as the fixed effects; subject as a random effect; visit as a repeated measure; and race as the covariate.

The SAS code is shown below.

`proc import datafile="./fev_data.csv" out=fev_data dbms=csv replace; getnames=yes;run;proc mixed data=fev_data method=reml; class ARMCD(ref='PBO') AVISIT RACE USUBJID; model FEV1 = RACE ARMCD AVISIT ARMCD*AVISIT / ddfm=KR; repeated AVISIT / subject=USUBJID type=UN r rcorr; lsmeans ARMCD*AVISIT / cl alpha=0.05 diff slice=AVISIT; lsmeans ARMCD / cl alpha=0.05 diff; ods output lsmeans=lsm diffs=diff;run;`

From the SAS code above, we can see that the `method` option specifies the estimation method as `REML`. The `repeated` statement is used to specify the repeated measures factor and control the covariance structure. In repeated measures models, the `subject` option defines which observations belong to the same subject and which belong to different subjects, who are assumed to be independent. The `type` option specifies the model for the covariance structure of the errors within subjects. We also add `ddfm=KR` in the `model` statement to specify the method for the denominator degrees of freedom (Kenward-Roger here). Lastly, the `lsmeans` statement with the `cl` and `diff` options is also very commonly used. These two options give us the confidence interval and the difference of the LS means, along with the p-value for the null hypothesis that the difference is `0`.

As for `ARMCD*AVISIT` in the `lsmeans` statement, it means you would like to get the LS means and their tests for every combination of treatment and visit. If you use `lsmeans ARMCD` instead, the result for each treatment is identical to the average over visits of the LS means from `lsmeans ARMCD*AVISIT`.

The equivalent model in R is shown below.

`library(mmrm)library(emmeans)data("fev_data")fit <- mmrm( formula = FEV1 ~ RACE + ARMCD + AVISIT + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data)# summary(fit)`

If you would like to obtain the LS mean of each visit for each group, like the `lsm`

dataset in SAS, you can use the `emmeans`

function from the `emmeans`

package as the mmrm object can be analyzed by the external package.

`# emmeans(fit, "ARMCD", by = "AVISIT")emmeans(fit, ~ ARMCD | AVISIT)## AVISIT = VIS1:## ARMCD emmean SE df lower.CL upper.CL## PBO 33.3 0.757 149 31.8 34.8## TRT 37.1 0.764 144 35.6 38.6## ## AVISIT = VIS2:## ARMCD emmean SE df lower.CL upper.CL## PBO 38.2 0.608 150 37.0 39.4## TRT 41.9 0.598 146 40.7 43.1## ## AVISIT = VIS3:## ARMCD emmean SE df lower.CL upper.CL## PBO 43.7 0.462 131 42.8 44.6## TRT 46.8 0.507 130 45.8 47.8## ## AVISIT = VIS4:## ARMCD emmean SE df lower.CL upper.CL## PBO 48.4 1.189 134 46.0 50.7## TRT 52.8 1.188 133 50.4 55.1## ## Results are averaged over the levels of: RACE ## Confidence level used: 0.95`

As for the `diff`

dataset from SAS, you can use the `pairs`

function to get identical outputs.

`pairs(emmeans(fit, ~ ARMCD | AVISIT), reverse = TRUE, adjust="tukey")## AVISIT = VIS1:## contrast estimate SE df t.ratio p.value## TRT - PBO 3.78 1.076 146 3.508 0.0006## ## AVISIT = VIS2:## contrast estimate SE df t.ratio p.value## TRT - PBO 3.76 0.853 148 4.405 <.0001## ## AVISIT = VIS3:## contrast estimate SE df t.ratio p.value## TRT - PBO 3.11 0.689 132 4.509 <.0001## ## AVISIT = VIS4:## contrast estimate SE df t.ratio p.value## TRT - PBO 4.41 1.681 133 2.622 0.0098## ## Results are averaged over the levels of: RACE`

I feel like if we use the ANCOVA model and focus on a specific timepoint before the end of the trial, we can say the treatment effect is the main difference between the treatment and control groups. But in MMRM, we include the information from all timepoints. Despite the collection of these intermediate outcomes, the primary outcome is often still the difference at that specific or final timepoint. Thus, MMRM has a couple of advantages, like improving the power and avoiding dropout bias, because subjects who withdraw from the study before the final timepoint may still contribute information in the interim. Once all the timepoints are included, the treatment-by-visit interaction should also be added to the model, to allow for the treatment effect differing in the slopes of outcomes over time.

Initially, the unstructured (`type=UN`

) covariance structure allows SAS to estimate the covariance matrix, as the unstructured approach makes no assumption at all about the relationship in the correlations among study visits. As for how to select an appropriate covariance structure, it depends on your understanding of the study and the data you have. Here are also a couple of documents for your reference if you would like to know which structure can be used and how to try and select a more suitable structure. For instance, the lower AIC values suggest a better fit.

Here are two documents for your reference:

- Selecting an Appropriate Covariance Structure
- Guidelines for Selecting the Covariance Structure in Mixed Model Analysis

- MMRM Package Introduction
- MIXED MODEL REPEATED MEASURES (MMRM)
- Proc mixed
- Understanding Interaction Effects in Statistics
- Mixed Model Repeated Measures (MMRM)
- Repeated Measures Modeling With PROC MIXED
- Mixed Models for Repeated Measures Should Include Time-by-Covariate Interactions to Assure Power Gains and Robustness Against Dropout Bias Relative to Complete-Case ANCOVA

`mcradds`

(version 1.0.1) helps with designing, analyzing and visualizing In Vitro Diagnostic trials. You can install it from CRAN with:

`install.packages("mcradds")`

or you can install the development version directly from GitHub with:

`if (!require("devtools")) { install.packages("devtools")}devtools::install_github("kaigu1990/mcradds")`

This blog post will introduce you to the package and its functions. Let's start by loading the package.

`library(mcradds)`

The `mcradds` R package is a complement to the `mcr` package and offers common and solid functions for designing, analyzing, and visualizing In Vitro Diagnostic (IVD) trials. In my experience as a statistician for diagnostic trials at Roche Diagnostics, the `mcr` package is an internally built tool for regression analysis and other relevant methodologies that is also widely used in the IVD industry community.

However, the `mcr` package focuses on method comparison trials and does not include other common diagnostic methods, which are provided in `mcradds` instead. It is intuitive and easy to use, so you can perform statistical analysis and produce graphics for different IVD trials with its analytical functions.

- Estimate the sample size for trials, following NMPA guidelines.
- Evaluate diagnostic accuracy with/without reference, following CLSI EP12-A2.
- Perform regression method analysis and plots, following CLSI EP09-A3.
- Perform Bland-Altman analysis and plots, following CLSI EP09-A3.
- Detect outliers with 4E method from CLSI EP09-A2 and ESD from CLSI EP09-A3.
- Estimate bias in medical decision level, following CLSI EP09-A3.
- Perform Pearson and Spearman correlation analysis, adding hypothesis test and confidence interval.
- Evaluate Reference Range/Interval, following CLSI EP28-A3 and NMPA guidelines.
- Add paired ROC/AUC test for superiority and non-inferiority trials, following CLSI EP05-A3/EP15-A3.
- Perform reproducibility analysis (reader precision) for immunohistochemical assays, following CLSI I/LA28-A2 and NMPA guidelines.
- Evaluate precision of quantitative measurements, following CLSI EP05-A3.

Please note that these functions and methods have not been formally validated or QC'ed, so I cannot guarantee that all of them are entirely correct and error-free. However, I always strive to compare the results against other resources to confirm that they are consistent. And because some of them were used in my past routine work, I believe the quality of this package is sufficient for use for now.

Let's demonstrate this by looking at a few examples. More detailed usage can be found on the Get started page.

Suppose that we have a new diagnostic assay with an expected sensitivity of `0.9`, and the clinically acceptable criterion is `0.85`. If we conduct a two-sided normal Z-test at a significance level of `α = 0.05` and want to achieve a power of `80%`, what should the total sample size be?

The result from sample size function is:

```r
size_one_prop(p1 = 0.9, p0 = 0.85, alpha = 0.05, power = 0.8)
#> 
#>  Sample size determination for one Proportion 
#> 
#>  Call: size_one_prop(p1 = 0.9, p0 = 0.85, alpha = 0.05, power = 0.8)
#> 
#>    optimal sample size: n = 363 
#> 
#>    p1:0.9 p0:0.85 alpha:0.05 power:0.8 alternative:two.sided
```
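The arithmetic behind this kind of result can be sketched in base R. This is a minimal sketch, assuming the common normal-approximation formula for a one-sample proportion test; `n_one_prop` is a hypothetical helper written for illustration and is not taken from the package source, although with these inputs it arrives at the same `n = 363`.

```r
# Hypothetical helper: normal-approximation sample size for testing
# H0: p = p0 (two-sided) when the true proportion is assumed to be p1.
n_one_prop <- function(p1, p0, alpha = 0.05, power = 0.8) {
  z_alpha <- qnorm(1 - alpha / 2) # two-sided critical value
  z_beta <- qnorm(power)          # quantile matching the target power
  num <- (z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1)))^2
  ceiling(num / (p1 - p0)^2)      # round up to a whole subject
}

n <- n_one_prop(p1 = 0.9, p0 = 0.85)
n
```

The variance under H0 is paired with the significance quantile and the variance under the alternative with the power quantile, which is why `p0` and `p1` each appear under their own square root.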

Suppose that you have data in a wide structure like `qualData`, which contains the qualitative measurements of the candidate (your own product) and comparative (reference product) assays. In this scenario, if you're interested in how to create a 2x2 contingency table, the `diagTab()` function is a good solution.

```r
data("qualData")
tb <- qualData %>%
  diagTab(
    formula = ~ CandidateN + ComparativeN,
    levels = c(1, 0)
  )
tb
#> Contingency Table: 
#> 
#> levels: 1 0
#>           ComparativeN
#> CandidateN   1   0
#>          1 122   8
#>          0  16  54
```

However, there are different formula settings when the data structure is long.

```r
dummy <- data.frame(
  id = c("1001", "1001", "1002", "1002", "1003", "1003"),
  value = c(1, 0, 0, 0, 1, 1),
  type = c("Test", "Ref", "Test", "Ref", "Test", "Ref")
) %>%
  diagTab(
    formula = type ~ value,
    bysort = "id",
    dimname = c("Test", "Ref"),
    levels = c(1, 0)
  )
dummy
#> Contingency Table: 
#> 
#> levels: 1 0
#>     Ref
#> Test 1 0
#>    1 1 1
#>    0 0 1
```

You can then use the `getAccuracy()` method to compute the diagnostic performance based on the table above.

```r
# Default method is Wilson score, and digit is 4.
tb %>% getAccuracy(ref = "r")
#>         EST LowerCI UpperCI
#> sens 0.8841  0.8200  0.9274
#> spec 0.8710  0.7655  0.9331
#> ppv  0.9385  0.8833  0.9685
#> npv  0.7714  0.6605  0.8541
#> plr  6.8514  3.5785 13.1181
#> nlr  0.1331  0.0832  0.2131
```
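To see where the sensitivity and specificity rows come from, the Wilson score interval can be computed by hand from the four cell counts of the contingency table printed earlier. This is a minimal sketch in base R; `wilson_ci` is a hypothetical helper for illustration, not the package's internal code.

```r
# Cell counts from the 2x2 table (candidate in rows, comparative in columns)
tp <- 122; fp <- 8; fn <- 16; tn <- 54

# Hypothetical helper: Wilson score interval for x successes out of n
wilson_ci <- function(x, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(EST = p, LowerCI = centre - half, UpperCI = centre + half)
}

sens <- wilson_ci(tp, tp + fn) # sensitivity: TP / (TP + FN)
spec <- wilson_ci(tn, tn + fp) # specificity: TN / (TN + FP)
round(rbind(sens, spec), 4)
```

Unlike the Wald interval, the Wilson interval is not centred on the point estimate, which is why the limits around a sensitivity near 0.88 are asymmetric.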

If you want to estimate the reader precision between different readers, reads, or sites, you can use `APA`, `ANA` and `OPA` as the primary endpoints, as in the PDL1 assay trials. Let's see an example of precision between readers.

```r
data("PDL1RP")
reader <- PDL1RP$btw_reader
tb1 <- reader %>%
  diagTab(
    formula = Reader ~ Value,
    bysort = "Sample",
    levels = c("Positive", "Negative"),
    rep = TRUE,
    across = "Site"
  )
getAccuracy(tb1, ref = "bnr", rng.seed = 12306)
#>        EST LowerCI UpperCI
#> apa 0.9479  0.9260  0.9686
#> ana 0.9540  0.9342  0.9730
#> opa 0.9511  0.9311  0.9711
```

Suppose that in another scenario you have quantitative data in a wide structure like `platelet` and would like to do a Bland-Altman analysis to obtain a series of descriptive statistics, including `mean`, `median`, `Q1`, `Q3`, `min` and `max`, as well as other estimates such as `CI` (confidence interval of the mean) and `LoA` (limits of agreement).

```r
data("platelet")
# Default difference type
blandAltman(
  x = platelet$Comparative, y = platelet$Candidate,
  type1 = 3, type2 = 5
)
#>  Call: blandAltman(x = platelet$Comparative, y = platelet$Candidate, 
#>     type1 = 3, type2 = 5)
#> 
#>   Absolute difference type:  Y-X
#>   Relative difference type:  (Y-X)/(0.5*(X+Y))
#> 
#>                             Absolute.difference Relative.difference
#> N                                           120                 120
#> Mean (SD)                       7.330 (15.990)      0.064 ( 0.145)
#> Median                                    6.350               0.055
#> Q1, Q3                         ( 0.150, 15.750)    ( 0.001,  0.118)
#> Min, Max                      (-47.800, 42.100)    (-0.412,  0.667)
#> Limit of Agreement            (-24.011, 38.671)    (-0.220,  0.347)
#> Confidence Interval of Mean    ( 4.469, 10.191)    ( 0.038,  0.089)
```
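The last two rows can be checked by hand from the printed mean and SD. This is a minimal sketch, assuming both intervals use the normal quantile (about 1.96); that assumption reproduces the printed absolute-difference limits to within rounding, but it is inferred from the output rather than from the package source.

```r
# Summary values copied from the printed output (absolute difference)
n <- 120
mean_diff <- 7.330
sd_diff <- 15.990

z <- qnorm(0.975)
loa <- mean_diff + c(-1, 1) * z * sd_diff               # limits of agreement
ci_mean <- mean_diff + c(-1, 1) * z * sd_diff / sqrt(n) # CI of the mean
round(rbind(loa, ci_mean), 3)
```

The limits of agreement describe the spread of individual differences, so they use the SD directly, while the confidence interval of the mean shrinks with `sqrt(n)`.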

The Bland-Altman plot can be easily produced with the `autoplot` method.

```r
object <- blandAltman(x = platelet$Comparative, y = platelet$Candidate)
# Absolute difference plot
autoplot(object, type = "absolute")
```

Here is a plot of the data.

Based on the output from Bland-Altman, you can also detect potential outliers using the `getOutlier()` method.

```r
# ESD approach
ba <- blandAltman(x = platelet$Comparative, y = platelet$Candidate)
out <- getOutlier(ba, method = "ESD", difference = "rel")
out$stat
#>   i       Mean        SD          x Obs     ESDi   Lambda Outlier
#> 1 1 0.06356753 0.1447540  0.6666667   1 4.166372 3.445148    TRUE
#> 2 2 0.05849947 0.1342496  0.5783972   4 3.872621 3.442394    TRUE
#> 3 3 0.05409356 0.1258857  0.5321101   2 3.797226 3.439611    TRUE
#> 4 4 0.05000794 0.1183096 -0.4117647  10 3.903086 3.436800    TRUE
#> 5 5 0.05398874 0.1106738 -0.3132530  14 3.318236 3.433961   FALSE
#> 6 6 0.05718215 0.1056542 -0.2566372  23 2.970250 3.431092   FALSE
out$outmat
#>   sid    x    y
#> 1   1  1.5  3.0
#> 2   2  4.0  6.9
#> 3   4 10.2 18.5
#> 4  10 16.4 10.8
```
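The `ESDi` and `Lambda` columns can be reproduced for the first row with a few lines of base R. This is a minimal sketch, assuming the standard generalized ESD critical value (Rosner) at `alpha = 0.05`; `esd_lambda` is a hypothetical helper for illustration, and the inputs are copied from the first row of `out$stat`.

```r
# Hypothetical helper: critical value for step i of the generalized ESD test
esd_lambda <- function(i, n, alpha = 0.05) {
  p <- 1 - alpha / (2 * (n - i + 1))
  t_crit <- qt(p, df = n - i - 1)
  (n - i) * t_crit / sqrt((n - i - 1 + t_crit^2) * (n - i + 1))
}

# First row of out$stat: the most extreme relative difference, mean and SD
x1 <- 0.6666667
mean1 <- 0.06356753
sd1 <- 0.1447540

esd1 <- abs(x1 - mean1) / sd1       # test statistic for step 1
lambda1 <- esd_lambda(i = 1, n = 120) # critical value for step 1
is_outlier <- esd1 > lambda1
```

At each later step the most extreme observation is removed, the mean and SD are recomputed, and the statistic is compared with the next critical value, which is why `Mean` and `SD` change from row to row.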

Suppose that you would like to evaluate the agreement between two assays with Deming regression. You can use `mcreg()`, the main regression function, which is wrapped from the `mcr` package.

```r
# Deming regression
fit <- mcreg(
  x = platelet$Comparative, y = platelet$Candidate,
  error.ratio = 1, method.reg = "Deming", method.ci = "jackknife"
)
```

As with the Bland-Altman plot, the `autoplot` function can draw the regression scatter plot with the fitted line, as shown below.

Based on this regression analysis, you can also estimate the bias at one or more medical decision levels.

```r
# absolute bias.
calcBias(fit, x.levels = c(30))
#>    Level     Bias       SE      LCI      UCI
#> X1    30 4.724429 1.378232 1.995155 7.453704
# proportional bias.
calcBias(fit, x.levels = c(30), type = "proportional")
#>    Level Prop.bias(%)       SE      LCI      UCI
#> X1    30      15.7481 4.594106 6.650517 24.84568
```
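The two outputs are linked by simple arithmetic, which the following base R sketch makes explicit. The values are copied from the absolute-bias output; the degrees of freedom (`n - 2 = 118`) are an assumption inferred from the printed limits, not something stated by the package.

```r
# Values copied from the absolute-bias output at decision level 30
level <- 30
bias <- 4.724429
se <- 1.378232

# Proportional bias is the absolute bias as a percentage of the level
prop_bias <- bias / level * 100
prop_se <- se / level * 100

# t-based interval for the absolute bias (df = 118 is an assumption)
ci_abs <- bias + c(-1, 1) * qt(0.975, df = 118) * se
round(c(prop_bias = prop_bias, prop_se = prop_se), 4)
round(ci_abs, 4)
```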

Suppose that you have data from a target population and would like to compute the 95% reference interval (RI) with the non-parametric method.

```r
data("calcium")
refInterval(x = calcium$Value, RI_method = "nonparametric", CI_method = "nonparametric")
#> 
#>  Reference Interval Method: nonparametric, Confidence Interval Method: nonparametric 
#> 
#>  Call: refInterval(x = calcium$Value, RI_method = "nonparametric", CI_method = "nonparametric")
#> 
#>   N = 240
#>   Outliers: NULL
#>   Reference Interval: 9.10, 10.30
#>   RefLower Confidence Interval: 8.9000, 9.2000
#>   Refupper Confidence Interval: 10.3000, 10.4000
```

Suppose that you want to see if the OxLDL assay is superior to the LDL assay by comparing the two AUCs of paired diagnostic assays using the standardized difference method, with a margin of `0.1`. In this case, the null hypothesis is that the difference in AUC is less than or equal to `0.1`.

```r
data("ldlroc")
# H0 : Superiority margin <= 0.1:
aucTest(
  x = ldlroc$LDL, y = ldlroc$OxLDL, response = ldlroc$Diagnosis,
  method = "superiority", h0 = 0.1
)
#> Setting levels: control = 0, case = 1
#> Setting direction: controls < cases
#> 
#> The hypothesis for testing superiority based on Paired ROC curve
#> 
#>  Test assay:
#>   Area under the curve: 0.7995
#>   Standard Error(SE): 0.0620
#>   95% Confidence Interval(CI): 0.6781-0.9210 (DeLong)
#> 
#>  Reference/standard assay:
#>   Area under the curve: 0.5617
#>   Standard Error(SE): 0.0836
#>   95% Confidence Interval(CI): 0.3979-0.7255 (DeLong)
#> 
#>  Comparison of Paired AUC:
#>   Alternative hypothesis: the difference in AUC is superiority to 0.1
#>   Difference of AUC: 0.2378
#>   Standard Error(SE): 0.0790
#>   95% Confidence Interval(CI): 0.0829-0.3927 (standardized differenec method)
#>   Z: 1.7436
#>   Pvalue: 0.04061
```
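The `Z` and `Pvalue` lines follow from the printed difference, its standard error, and the margin. This is a minimal sketch in base R using the rounded values from the output, so the result agrees with the printed statistics only to about three decimal places.

```r
# Values copied from the printed output
auc_diff <- 0.2378 # difference of the paired AUCs (test minus reference)
se_diff <- 0.0790  # standard error of the difference
margin <- 0.1      # superiority margin under H0

z <- (auc_diff - margin) / se_diff
p_value <- pnorm(z, lower.tail = FALSE) # one-sided, upper tail
round(c(Z = z, p = p_value), 4)
```

Subtracting the margin before standardizing is what turns the usual test of "difference = 0" into a test against the margin.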

Suppose that you would like to test `H0=0.5` rather than `H0=0` in a Pearson or Spearman correlation analysis; the `pearsonTest()` and `spearmanTest()` functions would be helpful.

```r
# Pearson hypothesis test
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c(2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8)
pearsonTest(x, y, h0 = 0.5, alternative = "greater")
#> $stat
#>        cor    lowerci    upperci          Z       pval 
#>  0.5711816 -0.1497426  0.8955795  0.2448722  0.4032777 
#> 
#> $method
#> [1] "Pearson's correlation"
#> 
#> $conf.level
#> [1] 0.95

# Spearman hypothesis test
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c(2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8)
spearmanTest(x, y, h0 = 0.5, alternative = "greater")
#> $stat
#>        cor    lowerci    upperci          Z       pval 
#>  0.6000000 -0.1478261  0.9656153  0.3243526  0.3728355 
#> 
#> $method
#> [1] "Spearman's correlation"
#> 
#> $conf.level
#> [1] 0.95
```
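The Pearson result can be reproduced in base R with the Fisher z-transformation. This is a minimal sketch of that standard approach, which matches the printed Pearson statistics; whether the package computes it in exactly this way is an assumption.

```r
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c(2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8)
h0 <- 0.5

r <- cor(x, y)
n <- length(x)

# Fisher z-transform: atanh(r) is roughly normal with SD 1 / sqrt(n - 3)
z_stat <- (atanh(r) - atanh(h0)) * sqrt(n - 3)
p_value <- pnorm(z_stat, lower.tail = FALSE) # alternative = "greater"

# Two-sided 95% interval, built on the z scale and transformed back
ci <- tanh(atanh(r) + c(-1, 1) * qnorm(0.975) / sqrt(n - 3))
round(c(cor = r, lowerci = ci[1], upperci = ci[2], Z = z_stat, pval = p_value), 4)
```

With only nine pairs the interval is wide, so an observed correlation of about 0.57 gives little evidence that the true correlation exceeds 0.5.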

That's it! That's the `mcradds` package. More details can be found in the Introduction to mcradds vignette.

The main reference is Chapter 22, Releasing to CRAN, of R Packages (https://r-pkgs.org/release.html). Follow the steps below.

Use `usethis::use_release_issue()` to open a GitHub issue with a checklist of recommended tasks you should finish before the release.

If you don't have a README document already, you should create one and render it with `devtools::build_readme()` before releasing. Don't forget to add the install instructions to the README. Keep the NEWS file up to date as well.

A vignette is necessary, as it is the long-form guide to your package. Use `usethis::use_vignette("my-vignette")` to create a default template first, and then follow the vignettes of other mature packages; borrowing a similar structure from them is okay (that's what I'm doing).

In addition, a website built with pkgdown also helps users learn more about your package. These functions can help you build it: `usethis::use_pkgdown()` does the initial setup, `pkgdown::build_site()` renders your site, and `usethis::use_pkgdown_github_pages()` deploys your site to GitHub Pages with GitHub Actions.

Check the DESCRIPTION carefully:

- Proofread the title and follow the naming rules: it should be plain text (no markup), capitalized like a title, and NOT end in a period.
- Provide a good description, which is very important.
- Check the version number, updating it manually or using `usethis::use_version()`.
- Don't forget to add the "cph" (copyright holder) role to `Authors@R`. If you are the only developer, you should list three roles together: "aut", "cre" and "cph".
- Make sure the license is reasonable and correct.
- Add correct URLs that pass CRAN's URL checks, and check them with `urlchecker::url_check()`.

Check the spelling and record all accepted words in `inst/WORDLIST` automatically with `usethis::use_spell_check()`. That's a fantastic approach, just a one-line command.

Finally, run `devtools::check()` once again to ensure everything is ready.

As usual, I use `devtools::check()` to double-check that all is still well before I merge or commit an update. But before releasing, you'd better add `remote = TRUE` and `manual = TRUE` to run `R CMD check` again, like `devtools::check(remote = TRUE, manual = TRUE)`, which will build and check the manual and perform a number of CRAN incoming checks.

Maybe you will encounter the same problem I had, a confusing warning: `pdflatex not found! Not building PDF manual`. I didn't understand the meaning of this warning at first. I checked all the options in R and RStudio, but that didn't work. Finally, I found that it occurred because I didn't have the `pdflatex` executable on this computer!

It's easy to solve the problem once you have found it. I chose to install `pdflatex` through TinyTeX, the solution provided by Yihui Xie, as described at https://yihui.org/tinytex/.

```r
install.packages('tinytex')
tinytex::install_tinytex()
```

Another option is to add some more packages for building PDF vignettes of many CRAN packages.

`tinytex:::install_yihui_pkgs()`

Finally, if it still doesn't work, make sure the path of `pdflatex` has been added to the PATH environment variable on your computer.

After `R CMD check`, you'd better use `devtools::check_win_devel()`, as checking with r-devel is required by CRAN policy. This also makes sure your package passes CRAN's win-builder service, which is only for Windows. Another good option is `rhub::check_for_cran()`, a service supported by the R Consortium, to check your package.

If this package is a new submission to CRAN, there are currently no downstream dependencies for it. If not, you should run the reverse dependency checks.

```r
usethis::use_revdep()
revdepcheck::revdep_check(num_workers = 4)
```

or

`revdepcheck::cloud_check()`

After all the above, record your comments about the submission in `cran-comments.md`, which was created by the `usethis::use_cran_comments()` call you ran at the start. There is no need to create it manually.

Once you're satisfied that all issues have been addressed and it's time to submit your package to CRAN, run `usethis::use_version()` to set the final version number you would like for the first CRAN release, and then submit using `devtools::submit_cran()` without any hesitation.

Afterwards, you will receive an email telling you that the package is pending a manual inspection as a new CRAN submission. You should get a response within the next 10 working days, but sometimes the feedback is very fast.

If there are comments from CRAN, respond to every remark and double-check everything. Fix what needs to be fixed; if you decide not to change something, explain why as clearly as you can. Don't forget to add a "Resubmission" section at the top of `cran-comments.md` to clearly identify that the package is a resubmission, and list the changes that you have made. Anything you want to explain or clarify can also be added there.

Finally, if you receive an email telling you that your package will be published within 24 hours in the corresponding CRAN directory, your package has been accepted and released on CRAN. You should then push it to GitHub with the new version number. Next, use `usethis::use_github_release()` to create a new release with a version tag on GitHub, and update the NEWS file as well to indicate that this is the CRAN release.

Now you can continue by increasing the version number to the development version using `usethis::use_dev_version()`. It makes sense to push to GitHub immediately so that any update will be based on the development version.

Other checking lists for CRAN are also available for reference.

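The `officer` package can be used to generate editable figures in PowerPoint. Here, editable means that every element of the figure, including the points, the X/Y axes, and the labels, can be modified, which is often used for touching up a figure afterwards. Reference: Chapter 5 officer for PowerPoint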

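In fact, `officer` is one package in the `officeverse` suite, which also includes other familiar packages, such as:

- `officedown`, which produces Word documents from R Markdown
- `flextable`, which produces very handy tables
- `rvg`, which produces vector graphics
- `mschart`, which produces native Microsoft Office charts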

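Now to the main topic. Suppose you have a figure produced in R, whether by base R graphics, by ggplot2, or by another plotting package (as long as it yields a `ggplot` object). It can be converted into an editable figure in PowerPoint as follows.

First, create the figure and wrap it as a vector graphic with the `rvg::dml` function, so that it can later be inserted into the slides.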

```r
library(rvg)
p1 <- dml(plot(1:10))

library(ggplot2)
g2 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  theme_classic()
p2 <- dml(ggobj = g2)

library(survival)
library(survminer)
g3 <- survfit(Surv(time, status) ~ sex, data = lung) %>%
  ggsurvplot(data = lung)
p3 <- dml(ggobj = g3$plot)
```

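Once the vector graphic objects have been created, add them to the PowerPoint file following the steps in the figure below.

First use `read_pptx` to create an empty PowerPoint file from the default template; then use `add_slide` to add an empty slide; finally use `ph_with` to insert the vector graphic object into it. To understand the arguments involved, you first need to know some basic components of Office PowerPoint; see 2.2 PowerPoint presentation properties.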

```r
library(officer)

doc <- read_pptx()
doc <- add_slide(doc, layout = "Title and Content", master = "Office Theme")
doc <- ph_with(doc, p1, location = ph_location_fullsize())
doc <- add_slide(doc, layout = "Title and Content", master = "Office Theme")
doc <- ph_with(doc, p2, location = ph_location_fullsize())
doc <- add_slide(doc, layout = "Title and Content", master = "Office Theme")
doc <- ph_with(doc, p3, location = ph_location_fullsize())
print(doc, target = "test.pptx")
```

Finally, you can open the `test.pptx` file and touch up the figures.

In terms of the ANCOVA model, if you would like to add a non-inferiority or superiority margin, you can just use the `lsmestimate` statement with `testvalue=2` when the margin is 2. For multiple imputation, however, you can't just add this statement in the analysis step; you should define the margin in the pooling step.

To follow on from the last article, here I will use the identical example data and the same first and second steps of the MI process, and just illustrate the difference in the third step. Assume that the endpoint is the change from baseline (CHG) at week 6. Given that this drug is used to reduce the primary indicator, the null hypothesis might be that the CHG in the treatment group minus that in the placebo group is more than `-2`, meaning that the drug is not superior to placebo by this margin.

```sas
ods output ParameterEstimates=super;
proc mianalyze data=diff theta0=-2;
  modeleffects estimate;
  stderr stderr;
run;
```

The combined result with a margin of `-2` is as follows.

Now we can see that the `Theta0` value is `-2` rather than the default `0`, and the two-sided p-value is `0.4745`. If we would like to obtain the one-sided p-value, an additional calculation can be done. Simply halving the two-sided p-value gives the same result.

```sas
data super;
  set super;
  pval = (1 - probt(abs(tvalue), df));
run;
```

Otherwise the t-statistic and p-value can also be computed by the t distribution formula, as shown below in R.

```r
est <- -2.803439
theta0 <- -2
se <- 1.123403
df <- 4800.7

t <- (est - theta0) / se
t
#> [1] -0.7151832

pval <- pt(t, df)
pval
#> [1] 0.2372653
```

The superiority test is used as the example above, but a non-inferiority test follows the same procedure with a different margin.

There are plenty of methods that could be applied to missing data, depending on the goal of the clinical trial. The most common and recommended is multiple imputation (MI), and other methods such as last observation carried forward (LOCF), observed case (OC) and mixed model for repeated measures (MMRM) are also available for sensitivity analysis.

Multiple imputation is a model-based method, but it's not just a model for imputing data; it's also a framework that incorporates various analytic models such as ANCOVA. In general, there are 3 steps to implement MI, and they are the same in R and SAS.

- Imputation: generate M datasets with imputed values. Before starting this step, you'd better examine the missing pattern, which is either a monotone or an arbitrary pattern of missing data.
- Analysis: generate M sets of estimates from the M imputed datasets using the statistical model.
- Pooling: combine the M sets of estimates into one MI estimate. This is very different from other imputation methods, as it not only imputes missing values but also combines the estimates from multiple imputed datasets. The pooling method is Rubin's Rules (RR), which can pool parameter estimates such as mean differences, regression coefficients and standard errors, and then derive confidence intervals and p-values. The pooling logic will be briefly introduced below.

After the routine introduction of MI, let's talk about how to implement the MI model to deal with actual missing data in SAS. I'm also planning to compare the SAS procedure with the `rbmi` R package in the next article. To be honest, I tend to use R instead of SAS in my actual work, so I would like to introduce more uses of R in clinical trials.

Here is an example dataset from an antidepressant clinical trial of an active drug versus placebo. The relevant endpoint is the Hamilton 17-item depression rating scale (HAMD17), which was assessed at baseline and at weeks 1, 2, 4, and 6. This example comes from the `rbmi` package so that I can use the same dataset in R programming. I first pre-process and transpose the data to meet the data structure required by the MI procedure in SAS.

```r
library(rbmi)
library(tidyverse)

data("antidepressant_data")
dat <- antidepressant_data

# Use expand_locf to add rows corresponding to visits with missing outcomes to the dataset
dat <- expand_locf(
  dat,
  PATIENT = levels(dat$PATIENT), # expand by PATIENT and VISIT
  VISIT = levels(dat$VISIT),
  vars = c("BASVAL", "THERAPY"), # fill with LOCF BASVAL and THERAPY
  group = c("PATIENT"),
  order = c("PATIENT", "VISIT")
)

dat2 <- pivot_wider(
  dat,
  id_cols = c(PATIENT, THERAPY, BASVAL),
  names_from = VISIT,
  names_prefix = "VISIT",
  values_from = HAMDTL17
)
write.csv(dat2, file = "./antidepressant.csv", na = "", row.names = F)
```

And then import the csv file into SAS, as shown below.

Next, we should examine the missing pattern in the dataset by using zero imputations (`nimpute=0`).

```sas
proc mi data=antidepressant nimpute=0;
  var BASVAL VISIT4 VISIT5 VISIT6 VISIT7;
run;
```

The above graph indicates that there is a patient who doesn't fit the monotone missing data pattern, so the missing pattern is non-monotone. As for which MI method should be used, the MISSING DATA PATTERNS section and Table 4 of the article MI FOR MI, OR HOW TO HANDLE MISSING INFORMATION WITH MULTIPLE IMPUTATION can be used as references. I will select the MCMC (Markov Chain Monte Carlo) method for the multiple imputation.

Then, for the first step, I choose MCMC full-data imputation with `impute=full` and specify the BY statement to obtain separate imputed datasets for each treatment group. I also specify the seed number, but keep in mind that it only applies to the first BY group; the second group does not use the seed you define but another fixed seed that follows from it.

```sas
proc sort data=antidepressant; by THERAPY; run;

proc mi data=antidepressant seed=12306 nimpute=100 out=imputed_data;
  mcmc chain=single impute=full initial=em (maxiter=1000) niter=1000 nbiter=1000;
  em maxiter=1000;
  by THERAPY;
  var BASVAL VISIT4 VISIT5 VISIT6 VISIT7;
run;
```

The second step is to fit the analysis model to each of the imputed datasets created in the first step. Assume that I want to estimate the change from baseline at week 6 with the LS mean in each treatment group from an ANCOVA model, together with the difference between them. The model includes the fixed effect of treatment and the baseline value as a covariate.

```sas
data imputed_data2;
  set imputed_data;
  CHG = VISIT7 - BASVAL;
run;

proc sort; by _imputation_; run;

ods output lsmeans=lsm diffs=diff;
proc mixed data=imputed_data2;
  by _imputation_;
  class THERAPY;
  model CHG=BASVAL THERAPY / ddfm=kr;
  lsmeans THERAPY / cl pdiff diff;
run;
```

We can see the LS mean for each imputation (`_Imputation_`) in the `lsm` dataset, where each imputation has two rows, one for drug and one for placebo, and the difference between the two groups in the `diff` dataset, as shown below.

The third step is to pool all estimates from the second step, including the LS mean estimates and difference.

```sas
proc sort data=lsm; by THERAPY; run;

ods output ParameterEstimates=combined_lsm;
proc mianalyze data=lsm;
  by THERAPY;
  modeleffects estimate;
  stderr stderr;
run;

ods output ParameterEstimates=combined_diff;
proc mianalyze data=diff;
  by THERAPY _THERAPY;
  modeleffects estimate;
  stderr stderr;
run;
```

For now the imputations have been combined as shown below.

The results above indicate that the imputations have been combined and the final estimate is calculated by Rubin's Rules (RR). The t-statistic, confidence interval and p-value are based on the t distribution, so the most important part is how to calculate the estimate and standard error. The pooled estimate is the mean of the estimates from all imputations, and the pooled SE is the square root of `Vtotal`, which can be calculated through formulas 9.2-9.4. The formulas are cited from https://bookdown.org/mwheymans/bookmi/rubins-rules.html.
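For reference, the pooling step can be written out as follows. The notation is my own and may differ from the cited chapter: $m$ is the number of imputations, $\hat{\theta}_i$ and $SE_i$ are the estimate and standard error from the $i$-th imputed dataset, $V_W$ is the within-imputation variance and $V_B$ the between-imputation variance.

```latex
\begin{aligned}
\bar{\theta} &= \frac{1}{m}\sum_{i=1}^{m}\hat{\theta}_i \\
V_W &= \frac{1}{m}\sum_{i=1}^{m} SE_i^{2} \\
V_B &= \frac{1}{m-1}\sum_{i=1}^{m}\left(\hat{\theta}_i-\bar{\theta}\right)^{2} \\
V_{total} &= V_W + V_B + \frac{V_B}{m}, \qquad SE_{pooled} = \sqrt{V_{total}}
\end{aligned}
```

The extra term $V_B / m$ accounts for the fact that only a finite number of imputations is used to estimate $\bar{\theta}$.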

I will illustrate the computing process of RR in R with the `diff` dataset as an example. I use R because it makes the matrix operations easy for me.

```r
diff <- haven::read_sas("./diff.sas7bdat")

est <- mean(diff$Estimate)
est
#> [1] -2.803439

n <- nrow(diff)
Vw <- mean(diff$StdErr^2)
Vb <- sum((diff$Estimate - est)^2) / (n - 1)
Vtotal <- Vw + Vb + Vb / n
se <- sqrt(Vtotal)
se
#> [1] 1.123403
```

These `est` and `se` values are equal to the pooled Estimate of `-2.803439` and StdErr of `1.123403` in SAS.

As for the other parameters, you may ask how the DF and t-statistic are calculated. I recommend reading the entire chapter mentioned above to understand the complete process, which is not covered here. Once the DF and t-statistic are determined, the confidence interval and p-value can be easily computed from the t distribution.

Multiple imputation is a recommended and useful tool in trial use, which provides robust parameter estimates depending on which missing pattern your data has.

In the next article, I will try to illustrate how to use MI in non-inferiority and superiority trials.

Multiple Imputation using SAS and R Programming

MI FOR MI, OR HOW TO HANDLE MISSING INFORMATION WITH MULTIPLE IMPUTATION

Chapter9 Rubin's Rules

We all know that there are two common methods to compute the confidence limits for a hazard ratio in the SAS `PHREG` procedure:

- Wald's Confidence Limits
- Profile-Likelihood Confidence Limits

However, in R, we commonly use the `confint()` or `summary()` function to compute the CI from the `coxph` model, which assumes normality, so it is identical to Wald's CI.

You can also compute it manually as `EST ± SE * Z`, as shown below.

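```r
library(survival)

m <- coxph(Surv(time, status) ~ ph.ecog, data = na.omit(lung))
ss <- summary(m)
coef <- coef(m)
se <- ss$coefficients[, "se(coef)"]
# Wald limits on the log scale, exponentiated to the hazard ratio scale
c(exp(coef - qnorm(0.975) * se), exp(coef + qnorm(0.975) * se))
```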

But what's the weakness of Wald's CI? Refer to Why and When to Use Profile Likelihood Based Confidence Intervals

This blog says that since the standard errors of the general linear model are based on asymptotic variance, they may not be a good estimator of standard error for small samples. In particular, Wald Confidence Intervals may not perform very well. One should only use the Wald Confidence Interval if the likelihood function is symmetric about the MLE.

So what's the superiority of the Profile Likelihood CI?

In cases where the likelihood function is not symmetric about the MLE, the Profile Likelihood Based Confidence Interval serves better. This is because the Profile Likelihood Based Confidence Interval is based on the asymptotic chi-square distribution of the log likelihood ratio test statistic.

If you use the SAS `PHREG` procedure, you can simply set the `rl` (`risklimits`) option of the MODEL statement to `pl` to get the Profile Likelihood CI for the hazard ratio.

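Unfortunately, you cannot get it from the `coxph()` function in the `survival` package. I have tried the `coxphf()` function from the `coxphf` package, but the CI is not identical to SAS, differing in a few decimal places. One likely reason is that `coxphf` fits the Cox model with Firth's penalized likelihood rather than the ordinary partial likelihood.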

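```r
# pl = TRUE requests profile (penalized) likelihood confidence limits;
# pl = FALSE would fall back to Wald limits
m2 <- coxphf::coxphf(formula = Surv(time, status) ~ ph.ecog, pl = TRUE, data = na.omit(lung))
summary(m2)
```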

Anyway, this is an alternative way to compute the Profile Likelihood CI.

It mainly introduces four inference methods:

- One-Proportion Inference
- One-Mean Inference
- Two-proportion inference
- Two-mean inference

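When you want to test whether a population proportion differs from a fixed value, you can use **One-Proportion Inference**.

For example, you conjecture that the proportion of women in some churches exceeds 55%. Here 55% is the proportion we want to make inference about, and the hypotheses are `H0: pi=0.55` and `Ha: pi>0.55`. We then find that in one church 62 of 100 people are women, so `phat=62/100`. Do we have enough evidence to reject the null hypothesis? After all, this is only a sample from one church, so we turn to simulation and compute the P-value of the hypothesis test under the assumption that the null hypothesis is true. To do so, we use the `do` and `rflip` functions to build 1000 simulated trials with `pi=0.55`.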

```r
library(mosaic)

pi <- 0.55          # probability of success for each toss
n <- 100            # Number of times we toss the penny (sample size)
trials <- 1000      # Number of trials (number of samples)
observed <- 62      # Observed number of heads
phat = observed / n # p-hat - the observed proportion of heads

data.sim <- do(trials) * rflip(n, prob = pi)
```

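Next, count how many of the 1000 simulations produce a proportion greater than or equal to `phat`; dividing that count by the total number of trials gives the "P-value" we are after. This is fairly easy to understand. Because it comes from a simulation, the value will not exactly match the one in the reference, but as the number of simulations grows, the result settles around a stable value.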

`pvalue <- sum(data.sim$prop >= phat) / trials`
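The same simulation can be sketched without `mosaic`, which may help show what `do()` and `rflip()` are doing. This is a minimal sketch in base R, assuming `rbinom()` as a stand-in for the coin flips; the seed and the variable names are my own choices for illustration.

```r
set.seed(2024) # arbitrary seed, only for reproducibility

pi0 <- 0.55      # proportion under H0
n <- 100         # sample size of each simulated trial
trials <- 1000   # number of simulated trials
phat <- 62 / 100 # observed proportion

# Each draw is the number of "heads" in n flips; divide by n for a proportion
sim_prop <- rbinom(trials, size = n, prob = pi0) / n
p_sim <- mean(sim_prop >= phat)

# Exact binomial tail probability P(X >= 62) for comparison
p_exact <- pbinom(61, size = n, prob = pi0, lower.tail = FALSE)
c(simulated = p_sim, exact = p_exact)
```

Comparing the simulated value with the exact binomial tail shows how close 1000 trials get; increasing `trials` narrows the gap.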

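When the quantity you want to make inference about is a mean rather than a proportion, consider **One-Mean Inference**.

For example, you conjecture that a car's average miles per gallon is not 22. Here `22` is the mean we want to test, and the hypotheses are `H0: μ=22` and `Ha: μ≠22`. We observe that the mean miles per gallon (`mpg`) in the `mtcars` dataset is 20.09. We can then resample the `mtcars` dataset to assess the hypothesis above.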

```r
mu <- 22
observed <- mean(~mpg, data = mtcars, na.rm = T)
paste("Observed value for sample mean: ", observed)

trials <- 1000
samples <- do(trials) * mosaic::mean(~mpg, data = resample(mtcars))
```

Based on the simulated `samples` data, we can roughly compute a 95% confidence interval.

```r
# Let's compute a 95% Confidence Interval
(ci <- quantile(samples$mean, c(0.025, 0.975)))
#     2.5%    97.5% 
# 18.18406 22.23453
```

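The confidence interval contains the value `μ=22` from H0, so as a first impression we cannot reject the null hypothesis; that is, the data do not support the conjecture that the average miles per gallon differs from `22`.

Next, from the resampled data we compute the proportion of `mpg` means that are at least `22`; because the hypothesis is two-sided, the resulting P-value is multiplied by 2.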

```r
pvalue <- sum(samples$mean >= mu) / trials
paste("Two-sided p-value is", 2 * pvalue)
# [1] "Two-sided p-value is 0.088"
```

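When you want to know whether two proportions differ significantly, consider **Two-proportion inference**.

For example, you conjecture that the proportion of women who support a certain policy differs from the proportion of men who support it. These two proportions are what we need to compare, and the hypotheses are `H0: π1=π2` and `Ha: π1≠π2`. In one sample we observe that `p1=62/100` of men support the policy, while for women it is `p2=51/100`. The data are as follows: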

```r
df <- rbind(
  do(38) * data.frame(Group = "Men", Support = "no"),
  do(62) * data.frame(Group = "Men", Support = "yes"),
  do(49) * data.frame(Group = "Women", Support = "no"),
  do(51) * data.frame(Group = "Women", Support = "yes")
)
(df.summary <- tally(Support ~ Group, data = df))
```

Then compute the difference in proportions between the two groups, which is simply `0.62-0.51=0.11`, or:

```r
observed <- diffprop(Support ~ Group, data = df)
paste("Observed difference in proportions: ", round(observed, 3))
# [1] "Observed difference in proportions: 0.11"
```

Then simulate by shuffling the group labels to see the null distribution of the difference between the two groups.

```r
trials <- 1000
null.dist <- do(trials) * diffprop(Support ~ shuffle(Group), data = df)
histogram(~diffprop,
  data = null.dist,
  xlab = "Differences in proportions",
  main = "Null distribution for differences in proportions",
  v = observed
)
```

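Finally, compute the P-value from the null distribution to see whether we can reject the null hypothesis. In other words, if H0 is true, what is the probability of a difference of `0.11` or something more extreme, and is it small (for example, below 0.05)? If the P-value is small, the null hypothesis can be rejected.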

```r
p.value <- prop(~diffprop >= observed, data = null.dist)
paste(" One-sided p-value: ", round(p.value, 3))
# [1] " One-sided p-value: 0.088"
```
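The shuffling step can also be written in base R, which makes the permutation logic explicit. This is a minimal sketch assuming the same counts as the data above (62 of 100 men and 51 of 100 women answering "yes"); `diff_prop`, the seed, and the two-sided variant are my own additions for illustration.

```r
set.seed(2024) # arbitrary seed, only for reproducibility

group <- rep(c("Men", "Women"), each = 100)
support <- c(rep(c("no", "yes"), c(38, 62)), rep(c("no", "yes"), c(49, 51)))

# Difference in the proportion of "yes" between the two groups
diff_prop <- function(g, s) {
  mean(s[g == "Men"] == "yes") - mean(s[g == "Women"] == "yes")
}

observed <- diff_prop(group, support)

# Shuffle the group labels to break any link between group and support
null_dist <- replicate(1000, diff_prop(sample(group), support))

p_one_sided <- mean(null_dist >= observed)
p_two_sided <- mean(abs(null_dist) >= abs(observed))
c(one_sided = p_one_sided, two_sided = p_two_sided)
```

Since the alternative hypothesis here is two-sided, the two-sided version, which counts extreme differences in either direction, is arguably the more natural P-value to report.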

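When you want to know whether two means differ significantly, consider **Two-mean inference**.

For example, you conjecture that the average mercury content found in albacore tuna differs from that in yellowfin tuna. These two means are what we need to compare, and the hypotheses are `H0: μ1=μ2` and `Ha: μ1≠μ2`. In the dataset `tuna.txt` we observe an albacore mean of `0.35763` and a yellowfin mean of `0.35444`, with a difference of `-0.003`. We can then shuffle the group labels of the `tuna` dataset to simulate and assess the hypothesis above.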

```r
df <- read.delim("http://citadel.sjfc.edu/faculty/ageraci/data/tuna.txt")
str(df)
favstats(~Mercury | Tuna, data = df)

observed <- diffmean(~Mercury | Tuna, data = df, na.rm = T)
paste("Observed difference in the means: ", round(observed, 3))
# [1] "Observed difference in the means: -0.003"
```

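As in `Two-proportion inference`, we use the `shuffle` function to shuffle the groups, simulate 1000 times, and then compute the P-value to judge how extreme the observed value is when the null hypothesis holds, and finally whether we can reject the null hypothesis.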

```r
trials <- 1000
null.dist <- do(trials) * diffmean(Mercury ~ shuffle(Tuna), data = df, na.rm = T)
pvalue <- prop(~diffmean <= observed, data = null.dist)
paste("The one-sided p-value is ", round(pvalue, 3))
# [1] "The one-sided p-value is 0.428"
```

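The above is a brief record of the reference materials. Simulation is a very interesting approach that is also fairly common in clinical trials and is worth further study; the simulations introduced here are very simple, easy-to-understand examples.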

Chapter 7 Simulation-based Inference

Simulation-based inference with mosaic

MOSAIC R packages

This post does not cover how to determine which type of missingness your data have; for that, you can refer to the article Multiple Imputation.

Or a summary (Missing data assumptions and corresponding imputation methods) in Multiple imputation as a valid way of dealing with missing data

Let's keep it practical and focus on how to impute missing data. For example, LOCF (Last Observation Carried Forward) is a standard method for imputing missing data in clinical trials. It fills in a missing value at a later visit with the subject's last observed value, which can lead to biased results. Other methods such as BOCF (Baseline Observation Carried Forward), WOCF (Worst Observation Carried Forward), and Multiple Imputation are also used, but are rarely seen in oncology studies. Finally, MMRM (Mixed-effect Model for Repeated Measures) is a common method for continuous data with missing values.

Given that SAS is still the dominant delivery language, here I will record how to use SAS to handle missing data. However, I also strongly suggest using R as the alternative or QC language, as I believe R will be accepted by regulatory authorities, at least as an optional delivery language. I will therefore record in another article how to use the `rbmi` package to deal with missing data using methods like LOCF and multiple imputation, and how to compute the LS means with an ANCOVA model.

Here we create a dummy dataset with 3 columns: `usubjid`, `avisitn` and `aval`.

```sas
data data;
  input usubjid $8. avisitn aval;
  datalines;
1001-101 0 85
1001-101 1 84
1001-101 2 86
1001-101 3 .
1001-101 4 .
1001-101 5 85
1002-101 0 90
1002-101 1 .
1002-101 2 91
1002-101 3 92
1002-101 4 .
1002-101 5 .
;
run;

proc sort; by usubjid avisitn; run;
```

There are several ways to implement LOCF; see LOCF-Different Approaches, Same Results Using LAG Function, RETAIN Statement, and ARRAY Facility. I usually use the `RETAIN` statement as it's easy to understand and also very elegant. In the final code, whenever `usubjid` changes, the retained variable `rn` is reset to missing (`.`). Then, for each record, the `if` statement checks whether `aval` is non-missing: if so, `rn` is updated to that value; otherwise `rn`, which holds the last non-missing value, is carried forward into `aval_locf` and `dtype` is set to `LOCF`.

```sas
data locf;
  length dtype $10.;
  retain rn;
  set data;
  by usubjid avisitn;
  if first.usubjid then rn = .;
  if aval ne . then do;
    rn = aval;
    aval_locf = aval;
  end;
  else do;
    aval_locf = rn;
    dtype = "LOCF";
  end;
run;
```

The resulting dataset contains the LOCF'ed variable `aval_locf`, with `dtype` set to `LOCF` on the imputed records.
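For QC, the same LOCF logic can be reproduced in base R on the same dummy data. This is just a minimal sketch: the helper `locf` is a hypothetical function written for this example, not something taken from a package.

```r
# Same dummy data as in the SAS example
dat <- data.frame(
  usubjid = rep(c("1001-101", "1002-101"), each = 6),
  avisitn = rep(0:5, times = 2),
  aval    = c(85, 84, 86, NA, NA, 85,
              90, NA, 91, 92, NA, NA),
  stringsAsFactors = FALSE
)
dat <- dat[order(dat$usubjid, dat$avisitn), ]

# Hypothetical helper: carry the last non-missing value forward
locf <- function(x) {
  idx <- cumsum(!is.na(x))       # position of the most recent observed value
  c(NA, x[!is.na(x)])[idx + 1]   # idx == 0 (nothing observed yet) stays NA
}

# Apply within each subject, like the BY-group processing in SAS
dat$aval_locf <- ave(dat$aval, dat$usubjid, FUN = locf)
dat$dtype     <- ifelse(is.na(dat$aval), "LOCF", "")

print(dat)
```

For subject `1001-101` the two missing visits (3 and 4) take the value 86 from visit 2, and for subject `1002-101` visit 1 takes 90 while visits 4 and 5 take 92.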

The BOCF and WOCF methods are usually considered conservative, like LOCF, and their programming logic is roughly the same. BOCF can be used when subjects drop out due to an adverse event, while WOCF can be used when they drop out due to lack of efficacy (LOE).
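To illustrate how similar the logic is, here is a base-R sketch of BOCF and WOCF on the same dummy data. It rests on two assumptions of mine that a real study would have to define in its analysis plan: `avisitn == 0` is the baseline visit, and a lower `aval` is the worse outcome. WOCF is taken here as the worst observed value per subject.

```r
# Same dummy data as in the LOCF example
dat <- data.frame(
  usubjid = rep(c("1001-101", "1002-101"), each = 6),
  avisitn = rep(0:5, times = 2),
  aval    = c(85, 84, 86, NA, NA, 85,
              90, NA, 91, 92, NA, NA),
  stringsAsFactors = FALSE
)

# Baseline value per subject (assumption: avisitn == 0 is baseline)
is_base  <- dat$avisitn == 0
baseline <- setNames(dat$aval[is_base], dat$usubjid[is_base])

# Worst observed value per subject (assumption: lower aval is worse)
worst <- ave(dat$aval, dat$usubjid,
             FUN = function(x) min(x, na.rm = TRUE))

# Replace only the missing records; observed values stay untouched
dat$aval_bocf <- ifelse(is.na(dat$aval), unname(baseline[dat$usubjid]), dat$aval)
dat$aval_wocf <- ifelse(is.na(dat$aval), worst, dat$aval)

print(dat)
```

Compared with LOCF, only the source of the carried value changes: the baseline record for BOCF, and the subject's worst observed record for WOCF.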

Multiple Imputation (MI) is more robust than LOCF because each missing value is imputed several times, so the uncertainty about the imputed values is carried into the final estimate.

The procedures for Multiple Imputation are generally the same in both SAS and R, such as:

- Imputation: the missing data are imputed `m` times with a specified model or distribution, generating `m` complete datasets.
- Analysis: each of these datasets is analyzed with a chosen statistical model or function, generating `m` sets of estimates.
- Pooling: the `m` sets of estimates are combined into one MI result with an appropriate method, such as Rubin's Rules (RR), which are specifically designed to pool parameter estimates and are built into SAS and R packages.
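The pooling step can be written out by hand to see what Rubin's Rules do. The sketch below is a minimal base-R illustration, not the implementation used by any package: the function name `pool_rubin` is hypothetical, and the estimates and standard errors are made-up numbers standing in for the results of `m = 5` analyses.

```r
# Hypothetical helper implementing Rubin's Rules for a single parameter
pool_rubin <- function(est, se) {
  m     <- length(est)
  qbar  <- mean(est)             # pooled point estimate
  w     <- mean(se^2)            # within-imputation variance
  b     <- var(est)              # between-imputation variance
  total <- w + (1 + 1 / m) * b   # total variance
  df    <- (m - 1) * (1 + w / ((1 + 1 / m) * b))^2  # Rubin's degrees of freedom
  list(estimate = qbar, se = sqrt(total), df = df)
}

# Made-up estimates and standard errors from m = 5 imputed datasets
est <- c(1.9, 2.1, 2.0, 2.2, 1.8)
se  <- c(0.50, 0.52, 0.49, 0.51, 0.50)

res <- pool_rubin(est, se)
res
```

The pooled standard error is larger than the average within-imputation standard error because the between-imputation variance is added on top, which is how MI reflects the uncertainty caused by the missing data.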

These procedures are easy to understand, so how do we implement them?

- In SAS, you can use the `proc mi` procedure for imputation, choose a statistical procedure such as `proc mixed` for analysis, and finally use the `proc mianalyze` procedure for pooling.
- In R, although several packages are available, I personally prefer the `mice` and `rbmi` packages, which will be introduced in other articles.

Unlike the two approaches above, the MMRM (Mixed-effect Model for Repeated Measures) method does not impute individual missing values. Instead, it uses all observed data and accounts for the correlation between repeated measurements within each subject, so the missing data are handled implicitly by the model.

In general, MMRM controls the type I error well, whereas LOCF may inflate it. The MI method can also control the type I error, but it tends to be more conservative than MMRM because it may underestimate the treatment effect.

There is a really impressive article that compares MMRM with MI and discusses the regulatory authorities' considerations on this topic; it is well worth reading: Handling of Missing Data: Comparison of MMRM (mixed model repeated measures) versus MI (multiple imputation).

- In SAS, you can simply use the `proc mixed` procedure, fitting a mixed model with a maximum likelihood-based method.
- In R, the `nlme` package is commonly used, but the newer `mmrm` package offers more advanced functionality (I have only heard of it so far).

Multiple Imputation

SAS LOCF For Multiple Variables

SAS LOCF

LOCF-Different Approaches, Same Results Using LAG Function, RETAIN Statement, and ARRAY Facility

LOCF Method and Application in Clinical Data Analysis

临床试验中缺失数据的预防与处理

Handling of Missing Data: Comparison of MMRM (mixed model repeated measures) versus MI (multiple imputation)

`ggplot2`. I ran into this question when I wanted to map two different color scales to the `col` aesthetic, for example one for `geom_line` and another for `geom_text`. Sometimes I choose another way to visualize the data to avoid this situation, but I really want to know how to solve it when I have to use this color strategy.

After some searching on Google, I found the best solution, thanks to Elio Campitelli, the author of the `ggnewscale` package. He demonstrates how to implement two color scales and explains the principle behind it. You can refer to his article, Multiple color (and fill) scales with ggplot2.

Here I just show how it works. Firstly we prepare the dummy data.

```r
library(tidyverse)
library(ggnewscale)

set.seed(123)
data <- tibble(
  id = rep(1:5, each = 4),
  day = sample(5:20, 20, replace = TRUE),
  linecol = str_c("col", id),
  day2 = day + 2,
  label = rep(c("Group1", "Group2"), each = 10)
)
```

Then I'd like to draw a line plot with labels around it. The line colors are determined by the `linecol` variable, while the label colors are determined by the `label` group. Let's first look at an example that, unsurprisingly, does not work.

```r
data %>%
  ggplot(aes(x = day, y = id)) +
  geom_line(aes(col = linecol)) +
  scale_color_manual(values = c("red", "orange", "yellow", "green", "blue")) +
  geom_text(aes(label = label, col = label)) +
  scale_color_manual(values = c("blue", "orange"), guide = NULL)

# Error
# Scale for colour is already present.
# Adding another scale for colour, which will replace the existing scale.
# Error in `palette()`:
# ! Insufficient values in manual scale. 7 needed but only 2 provided.
```

To solve it, you just need to add one line of code, as mentioned in the referenced article: `structure(ggplot2::standardise_aes_names("colour"), class = "new_aes")`, or the function `new_scale_color()` provided by the `ggnewscale` package. If you want to use a second `scale_fill_*`, replace "colour" with "fill".

So the final code without any error is shown below.

```r
data %>%
  ggplot(aes(x = day, y = id)) +
  geom_line(aes(col = linecol)) +
  scale_color_manual(values = c("red", "orange", "yellow", "green", "blue")) +
  structure(ggplot2::standardise_aes_names("colour"), class = "new_aes") +
  # new_scale_color() +
  geom_text(aes(label = label, col = label)) +
  scale_color_manual(values = c("blue", "orange"), guide = NULL)
```

Admittedly, this solution is not very common or official. I really hope it can be merged into the `ggplot2` family itself so that I only need to load one package.