Contrasts and Hypothesis Tests of emmeans

In the article Definition of least-squares means (LS means), we have known how to compute the LS mean step by step and how to implement it in the emmeans package that will calculate the estimated mean value for different factor variables and assume the mean value for continuous variables.

In addition, emmeans also contains a set of functions not limited for contrasts and hypothesis testing that are commonly used in clinical trial statistical analysis, such as ANCOVA and MMRM. So the goal of this article is not only to know how to use the emmeans package to answer these questions but also to learn several separate steps for each of them.

Let's start by fitting a model first. Given that I have an example of data fev_data from mmrm package, and then fit a simple ANCOVA model by lm function.

library(tidyverse)
library(tidymodels)
library(mmrm)
library(emmeans)

fit <- fev_data %>%
  filter(AVISIT == "VIS4" & !is.na(FEV1)) %>%
  lm(formula = FEV1 ~ ARMCD)
tidy(fit)

## # A tibble: 2 × 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)    47.8       1.23     38.8  4.73e-74
## 2 ARMCDTRT        4.83      1.74      2.77 6.33e- 3

I have known how to compute the LS means, but here we can learn the process of how the SE is calculated. And more details can be found in the link of https://bookdown.org/dereksonderegger/571/4-contrasts.html.

So we can calculate the LS mean estimate, SE and corresponding confidence interval as shown below. Let's try to focus on the TRT group.

X <- model.matrix(fit)
sigma.hat <- glance(fit) %>% pull(sigma)
beta.hat <- tidy(fit) %>% pull(estimate)
XtX.inv <- solve(t(X) %*% X)

# contrast for TRT ARM
cont <- c(1, 1)
est <- t(cont) %*% beta.hat
std.err <- sigma.hat * sqrt(t(cont) %*% XtX.inv %*% cont)
df <- glance(fit) %>% pull(df.residual)
q <- qt(1 - 0.05 / 2, df)
ci <- c(est) + c(-1, 1) * q * c(std.err)

setNames(c(est, std.err, df, ci), c("est", "SE", "df", "lower.ci", "upper.ci"))

##        est         SE         df   lower.ci   upper.ci 
##  52.592798   1.230847 132.000000  50.158062  55.027535

The same results can be obtained by calling the emmeans() function.

ems <- emmeans(fit, ~ARMCD)
ems

##  ARMCD emmean   SE  df lower.CL upper.CL
##  PBO     47.8 1.23 132     45.3     50.2
##  TRT     52.6 1.23 132     50.2     55.0

Next, we will create a contrast that is a linear combination of the means. In the following example, the contrast may answer the question of weather the treatment (TRT) produces a significant effect than placebo, like contrast=TRT-PRB.

# TRT vs. PBO: TRT - PBO
k <- c(-1, 1)

And then compute the estimation and standard error for the contrast. The basic principle of SE is that the variance of a linear combination of independent estimates is equal to the linear combination of their variances.

est <- tidy(ems) %>% pull(estimate)
se <- tidy(ems) %>% pull(std.error)
con_est <- con %*% est
con_se <- sqrt(con^2 %*% se^2)

Since we have got the estimation(con_est) and standard error(con_se) above, naturally we can use them to compute the confidence interval and p value following the t distribution with assuming the null hypothesis is TRT - PBO = 0

df <- tidy(ems) %>% pull(df) %>% unique()
t <- con_est[1, 1] / con_se[1, 1]
q <- qt(1 - 0.05 / 2, df)
ci <- con_est[1, 1] + c(-1, 1) * q * c(con_se)
pval <- 2 * pt(t, df, lower.tail = FALSE)

tibble(
  est = con_est[1,1],
  se = con_se[1,1],
  df = df,
  t = t,
  lower.ci = ci[1],
  upper.ci = ci[2],
  pval = pval
)

## # A tibble: 1 × 7
##     est    se    df     t lower.ci upper.ci    pval
##   <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>   <dbl>
## 1  4.83  1.74   132  2.77     1.39     8.27 0.00633

We can also obtain the same results from the below code straightforwardly.

contr <- contrast(ems, method = list(k), adjust = "none")
contr

##  contrast estimate   SE  df t.ratio p.value
##  c(-1, 1)     4.83 1.74 132   2.774  0.0063


confint(contr)

##  contrast estimate   SE  df lower.CL upper.CL
##  c(-1, 1)     4.83 1.74 132     1.39     8.27

## Confidence level used: 0.95

Suppose that you want to do the hypothesis test that treatment is superior to placebo with a margin of 2, just add a small change to the t statistic.

t2 <- (con_est[1, 1] - 2) / con_se[1, 1]
pval <- pt(t2, df, lower.tail = FALSE)
pval

## [1] 0.05322456

The same process can be implemented by the emmeans::test() function.

test(contr, null = 2, side = ">", adjust = "none")

##  contrast estimate   SE  df null t.ratio p.value
##  c(-1, 1)     4.83 1.74 132    2   1.625  0.0532

## P values are right-tailed

The above is only a simple example of a 1-way ANOVA, so that I can learn and understand it clearly. Actually, other complicated models and contrasts can be processed as well. Through these step by step computations, we can gain deeper thoughts of why we chose the functions, and what's the nature of our computation.

Referenece

Chapter 4 Contrasts
Chapter 9 Contrasts and multiple comparison testing
Chapter 6 Beginning to Explore the emmeans package for post hoc tests and contrasts
My notes on using {emmeans}
Confidence intervals and tests in emmeans