# KeepNotes blog

Stay hungry, Stay Foolish.

0%

In the article Definition of least-squares means (LS means), we have known how to compute the LS mean step by step and how to implement it in the `emmeans` package that will calculate the estimated mean value for different factor variables and assume the mean value for continuous variables.

In addition, `emmeans` also contains a set of functions not limited for contrasts and hypothesis testing that are commonly used in clinical trial statistical analysis, such as ANCOVA and MMRM. So the goal of this article is not only to know how to use the `emmeans` package to answer these questions but also to learn several separate steps for each of them.

Let's start by fitting a model first. Given that I have an example of data `fev_data` from `mmrm` package, and then fit a simple ANCOVA model by `lm` function.

``````library(tidyverse)
library(tidymodels)
library(mmrm)
library(emmeans)

fit <- fev_data %>%
filter(AVISIT == "VIS4" & !is.na(FEV1)) %>%
lm(formula = FEV1 ~ ARMCD)
tidy(fit)

## # A tibble: 2 × 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)    47.8       1.23     38.8  4.73e-74
## 2 ARMCDTRT        4.83      1.74      2.77 6.33e- 3``````

I have known how to compute the LS means, but here we can learn the process of how the `SE` is calculated. And more details can be found in the link of https://bookdown.org/dereksonderegger/571/4-contrasts.html.

So we can calculate the `LS mean` estimate, `SE` and corresponding confidence interval as shown below. Let's try to focus on the `TRT` group.

``````X <- model.matrix(fit)
sigma.hat <- glance(fit) %>% pull(sigma)
beta.hat <- tidy(fit) %>% pull(estimate)
XtX.inv <- solve(t(X) %*% X)

# contrast for TRT ARM
cont <- c(1, 1)
est <- t(cont) %*% beta.hat
std.err <- sigma.hat * sqrt(t(cont) %*% XtX.inv %*% cont)
df <- glance(fit) %>% pull(df.residual)
q <- qt(1 - 0.05 / 2, df)
ci <- c(est) + c(-1, 1) * q * c(std.err)

setNames(c(est, std.err, df, ci), c("est", "SE", "df", "lower.ci", "upper.ci"))

##        est         SE         df   lower.ci   upper.ci
##  52.592798   1.230847 132.000000  50.158062  55.027535``````

The same results can be obtained by calling the `emmeans()` function.

``````ems <- emmeans(fit, ~ARMCD)
ems

##  ARMCD emmean   SE  df lower.CL upper.CL
##  PBO     47.8 1.23 132     45.3     50.2
##  TRT     52.6 1.23 132     50.2     55.0``````

Next, we will create a contrast that is a linear combination of the means. In the following example, the contrast may answer the question of weather the treatment (`TRT`) produces a significant effect than placebo, like `contrast=TRT-PRB`.

``````# TRT vs. PBO: TRT - PBO
k <- c(-1, 1)``````

And then compute the estimation and standard error for the contrast. The basic principle of `SE` is that the variance of a linear combination of independent estimates is equal to the linear combination of their variances.

``````est <- tidy(ems) %>% pull(estimate)
se <- tidy(ems) %>% pull(std.error)
con_est <- con %*% est
con_se <- sqrt(con^2 %*% se^2)``````

Since we have got the estimation(`con_est`) and standard error(`con_se`) above, naturally we can use them to compute the confidence interval and p value following the t distribution with assuming the null hypothesis is `TRT - PBO = 0`

``````df <- tidy(ems) %>% pull(df) %>% unique()
t <- con_est[1, 1] / con_se[1, 1]
q <- qt(1 - 0.05 / 2, df)
ci <- con_est[1, 1] + c(-1, 1) * q * c(con_se)
pval <- 2 * pt(t, df, lower.tail = FALSE)

tibble(
est = con_est[1,1],
se = con_se[1,1],
df = df,
t = t,
lower.ci = ci[1],
upper.ci = ci[2],
pval = pval
)

## # A tibble: 1 × 7
##     est    se    df     t lower.ci upper.ci    pval
##   <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>   <dbl>
## 1  4.83  1.74   132  2.77     1.39     8.27 0.00633``````

We can also obtain the same results from the below code straightforwardly.

``````contr <- contrast(ems, method = list(k), adjust = "none")
contr

##  contrast estimate   SE  df t.ratio p.value
##  c(-1, 1)     4.83 1.74 132   2.774  0.0063

confint(contr)

##  contrast estimate   SE  df lower.CL upper.CL
##  c(-1, 1)     4.83 1.74 132     1.39     8.27

## Confidence level used: 0.95 ``````

Suppose that you want to do the hypothesis test that treatment is superior to placebo with a margin of `2`, just add a small change to the t statistic.

``````t2 <- (con_est[1, 1] - 2) / con_se[1, 1]
pval <- pt(t2, df, lower.tail = FALSE)
pval

## [1] 0.05322456``````

The same process can be implemented by the `emmeans::test()` function.

``````test(contr, null = 2, side = ">", adjust = "none")

##  contrast estimate   SE  df null t.ratio p.value
##  c(-1, 1)     4.83 1.74 132    2   1.625  0.0532

## P values are right-tailed ``````

The above is only a simple example of a 1-way ANOVA, so that I can learn and understand it clearly. Actually, other complicated models and contrasts can be processed as well. Through these step by step computations, we can gain deeper thoughts of why we chose the functions, and what's the nature of our computation.