In the article Definition of least-squares means (LS means), we have known how to compute the LS mean step by step and how to implement it in the emmeans
package that will calculate the estimated mean value for different factor variables and assume the mean value for continuous variables.
In addition, emmeans
also contains a set of functions not limited for contrasts and hypothesis testing that are commonly used in clinical trial statistical analysis, such as ANCOVA and MMRM. So the goal of this article is not only to know how to use the emmeans
package to answer these questions but also to learn several separate steps for each of them.
Let's start by fitting a model first. Given that I have an example of data fev_data
from mmrm
package, and then fit a simple ANCOVA model by lm
function.
library(tidyverse)
library(tidymodels)
library(mmrm)
library(emmeans)
fit <- fev_data %>%
filter(AVISIT == "VIS4" & !is.na(FEV1)) %>%
lm(formula = FEV1 ~ ARMCD)
tidy(fit)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 47.8 1.23 38.8 4.73e-74
## 2 ARMCDTRT 4.83 1.74 2.77 6.33e- 3
I have known how to compute the LS means, but here we can learn the process of how the SE
is calculated. And more details can be found in the link of https://bookdown.org/dereksonderegger/571/4-contrasts.html.
So we can calculate the LS mean
estimate, SE
and corresponding confidence interval as shown below. Let's try to focus on the TRT
group.
X <- model.matrix(fit)
sigma.hat <- glance(fit) %>% pull(sigma)
beta.hat <- tidy(fit) %>% pull(estimate)
XtX.inv <- solve(t(X) %*% X)
# contrast for TRT ARM
cont <- c(1, 1)
est <- t(cont) %*% beta.hat
std.err <- sigma.hat * sqrt(t(cont) %*% XtX.inv %*% cont)
df <- glance(fit) %>% pull(df.residual)
q <- qt(1 - 0.05 / 2, df)
ci <- c(est) + c(-1, 1) * q * c(std.err)
setNames(c(est, std.err, df, ci), c("est", "SE", "df", "lower.ci", "upper.ci"))
## est SE df lower.ci upper.ci
## 52.592798 1.230847 132.000000 50.158062 55.027535
The same results can be obtained by calling the emmeans()
function.
ems <- emmeans(fit, ~ARMCD)
ems
## ARMCD emmean SE df lower.CL upper.CL
## PBO 47.8 1.23 132 45.3 50.2
## TRT 52.6 1.23 132 50.2 55.0
Next, we will create a contrast that is a linear combination of the means. In the following example, the contrast may answer the question of weather the treatment (TRT
) produces a significant effect than placebo, like contrast=TRT-PRB
.
# TRT vs. PBO: TRT - PBO
k <- c(-1, 1)
And then compute the estimation and standard error for the contrast. The basic principle of SE
is that the variance of a linear combination of independent estimates is equal to the linear combination of their variances.
est <- tidy(ems) %>% pull(estimate)
se <- tidy(ems) %>% pull(std.error)
con_est <- con %*% est
con_se <- sqrt(con^2 %*% se^2)
Since we have got the estimation(con_est
) and standard error(con_se
) above, naturally we can use them to compute the confidence interval and p value following the t distribution with assuming the null hypothesis is TRT - PBO = 0
df <- tidy(ems) %>% pull(df) %>% unique()
t <- con_est[1, 1] / con_se[1, 1]
q <- qt(1 - 0.05 / 2, df)
ci <- con_est[1, 1] + c(-1, 1) * q * c(con_se)
pval <- 2 * pt(t, df, lower.tail = FALSE)
tibble(
est = con_est[1,1],
se = con_se[1,1],
df = df,
t = t,
lower.ci = ci[1],
upper.ci = ci[2],
pval = pval
)
## # A tibble: 1 × 7
## est se df t lower.ci upper.ci pval
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 4.83 1.74 132 2.77 1.39 8.27 0.00633
We can also obtain the same results from the below code straightforwardly.
contr <- contrast(ems, method = list(k), adjust = "none")
contr
## contrast estimate SE df t.ratio p.value
## c(-1, 1) 4.83 1.74 132 2.774 0.0063
confint(contr)
## contrast estimate SE df lower.CL upper.CL
## c(-1, 1) 4.83 1.74 132 1.39 8.27
## Confidence level used: 0.95
Suppose that you want to do the hypothesis test that treatment is superior to placebo with a margin of 2
, just add a small change to the t statistic.
t2 <- (con_est[1, 1] - 2) / con_se[1, 1]
pval <- pt(t2, df, lower.tail = FALSE)
pval
## [1] 0.05322456
The same process can be implemented by the emmeans::test()
function.
test(contr, null = 2, side = ">", adjust = "none")
## contrast estimate SE df null t.ratio p.value
## c(-1, 1) 4.83 1.74 132 2 1.625 0.0532
## P values are right-tailed
The above is only a simple example of a 1-way ANOVA, so that I can learn and understand it clearly. Actually, other complicated models and contrasts can be processed as well. Through these step by step computations, we can gain deeper thoughts of why we chose the functions, and what's the nature of our computation.
Referenece
Chapter 4 Contrasts
Chapter 9 Contrasts and multiple comparison testing
Chapter 6 Beginning to Explore the emmeans package for post hoc tests and contrasts
My notes on using {emmeans}
Confidence intervals and tests in emmeans