Response Rate and Odds Ratio in R and SAS

As we know, the objective response rate (ORR) is used as a key endpoint to demonstrate the efficacy of a treatment in oncology and is also valuable for clinical decision making in phase I-II trials, especially in single-arm trials.

The advantage of the ORR is that it can be assessed earlier than PFS/OS, and in smaller samples. The number of responders is usually assumed to follow a binomial distribution, so the ORR is treated as a binomial proportion, and the Clopper-Pearson (exact) method is frequently used to estimate the two-sided 95% confidence interval (CI). If you would like to adjust for stratification factors in a stratified study design, the Cochran-Mantel-Haenszel (CMH) test is a common solution.
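As a minimal sketch of what the Clopper-Pearson interval is, it can be written directly from beta quantiles in base R. The counts (20 responders out of 50 subjects) are only illustrative, and the helper name clopper_pearson is made up for this example; binom.test() in base R reports the same exact interval and is used here as a cross-check.

```r
# Clopper-Pearson (exact) CI for a binomial proportion, from beta quantiles.
# clopper_pearson() is a hypothetical helper name, not part of any package.
clopper_pearson <- function(x, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(est = x / n, lwr.ci = lower, upr.ci = upper)
}

# Illustrative counts: 20 responders among 50 subjects
ci <- clopper_pearson(20, 50)
ci

# Cross-check against the exact interval reported by base R
ref <- binom.test(20, 50)$conf.int
ref
```

The special cases for x == 0 and x == n set the limit to 0 or 1, which is the usual convention when no (or all) subjects respond.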

How about the odds ratio (OR)? It is a measure of the association between an exposure and an outcome: the odds of the outcome occurring under a particular exposure compared with the odds in the absence of that exposure. Thus, in randomized controlled trials we can use it to compare the ORR between the treatment and control groups, and it is usually reported together with the ORR of each arm and its 95% binomial CI. More details can be found in Explaining Odds Ratios.
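To make the definition concrete, here is a small base R sketch that computes an odds ratio and its Wald CI by hand from a 2x2 table. The counts and the arm labels are illustrative assumptions (20 responders and 30 non-responders in one arm, 22 and 28 in the other), not results from a real trial.

```r
# 2x2 table of illustrative counts: rows = arm, columns = response status
tab <- matrix(c(20, 30,
                22, 28),
              nrow = 2, byrow = TRUE,
              dimnames = list(arm = c("arm1", "arm2"),
                              response = c("yes", "no")))

# Odds of response within each arm, and their ratio
odds <- tab[, "yes"] / tab[, "no"]
or   <- odds[["arm1"]] / odds[["arm2"]]

# Wald CI on the log scale: log(OR) +/- z * sqrt(sum of reciprocal cell counts)
se_log_or <- sqrt(sum(1 / tab))
z  <- qnorm(0.975)
ci <- exp(log(or) + c(-1, 1) * z * se_log_or)

round(c(odds_ratio = or, lwr.ci = ci[1], upr.ci = ci[2]), 4)
```

An odds ratio below 1 means the odds of response are lower in arm1 than in arm2; a CI that includes 1 means the data are compatible with no difference.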

ORR and OR in SAS

Firstly, let's see how to use proc freq in SAS to obtain the ORR with its Clopper-Pearson (exact) CI and the odds ratio with and without stratification. Imagine we have example data with the columns TRTPN (1/2), ORR (1 for subjects with a response, 2 for subjects without), Strata1 (A/B) and Count (the number of subjects in each cell).

data dat;
    input TRTPN ORR Strata1 $ Count @@;   /* @@ reads several records per line */
    datalines;
1 1 A 8  1 1 B 12
1 2 A 17 1 2 B 13
2 1 A 13 2 1 B 9
2 2 A 20 2 2 B 8
;
run;

Then use the tables statement with the binomial option to compute the CI of the ORR. The level="1" suboption requests the proportion for ORR=1, i.e. subjects with a response, so the CI corresponds to the ORR. The exact binomial statement requests the Clopper-Pearson (exact) CI.

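ods listing close;
proc freq data=dat;
    by trtpn;                                  /* one ORR per arm; data must be sorted by TRTPN */
    weight count/zeros;                        /* rows are cell counts; keep zero cells */
    tables orr/binomial(level="1") alpha=0.05;
    exact binomial;                            /* Clopper-Pearson (exact) limits */
    ods output binomial=orrci;
run;
ods listing;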

Before the stratified analysis, let's look at the crude odds ratio without any stratification factors. The chisq option requests chi-square tests and measures of association, and relrisk displays the odds ratio and relative risks with asymptotic Wald CIs by default.

ods listing close;
proc freq data=dat;
    weight count/zeros;
    tables TRTPN*ORR /chisq relrisk;
    ods output FishersExact=pval RelativeRisks=ci;
run;
ods listing;

Next, let's use the cmh option in proc freq to obtain the CMH association statistics, the p-value of the Cochran-Mantel-Haenszel test, and the odds ratio adjusted for Strata1 with its CI. The stratification variable goes first in the tables request.

ods listing close;
proc freq data=dat;
    weight count/zeros;
    tables Strata1*TRTPN*ORR /cmh;             /* Strata1 is the stratification variable */
    ods output cmh=cmhpval CommonRelRisks=cmhci;
run;
ods listing;

Now that we have seen how proc freq computes the odds ratio with and without stratification, let's have a look at how to do the same with logistic regression using proc logistic.

proc logistic data=dat;
    freq count;                                /* each row represents Count subjects */
    class TRTPN / param=ref ref=last;          /* TRTPN=2 is the reference arm */
    model ORR(event='1')=TRTPN;
run;

The stratified analysis with proc logistic is shown below; the strata statement fits a conditional logistic regression within the levels of Strata1.

proc logistic data=dat;
    freq count;
    class TRTPN Strata1 / param=ref ref=last;
    strata Strata1;                            /* condition on Strata1 */
    model ORR(event='1')=TRTPN;
run;

However, the stratified odds ratio from proc logistic differs slightly from the CMH estimate from proc freq. The same happens in R as well.

ORR and OR in R

Now let's jump into the R section. How can we handle the same analysis in R?

First of all, I want to recommend the tern R package, which focuses on clinical statistical analysis and provides several helpful functions. More details can be found in the tern package documentation.

I create an example data set similar to the one shown above. It includes the same columns, but it has one row per subject rather than one row per cell count. The columns strata1 to strata3 represent three stratification factors.

# No seed is set, so the simulated counts change from run to run
dta <- data.frame(
  orr = sample(c(1, 0), 100, TRUE),
  trtpn = factor(rep(c(1, 2), each = 50), levels = c(2, 1)),  # arm 2 is the reference level
  strata1 = factor(sample(c("A", "B"), 100, TRUE)),
  strata2 = factor(sample(c("C", "D"), 100, TRUE)),
  strata3 = factor(sample(c("E", "F"), 100, TRUE))
)

Then you can use the DescTools::BinomCI() function to compute the CI of the ORR, and DescTools::BinomDiffCI() to compute the CI of the difference in ORR between the two treatments.

library(dplyr)  # provides %>% and count()
dta %>% count(trtpn, orr)

##   trtpn orr  n
## 1     2   0 28
## 2     2   1 22
## 3     1   0 30
## 4     1   1 20

DescTools::BinomCI(x = 20, n = 50, method = "clopper-pearson")

##      est    lwr.ci   upr.ci
## [1,] 0.4 0.2640784 0.548206

DescTools::BinomDiffCI(20, 50, 22, 50, method=c("wald"))

##        est     lwr.ci    upr.ci
## [1,] -0.04 -0.2333125 0.1533125
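For reference, the Wald interval for the difference of two proportions can be sketched in a few lines of base R. This is a hand calculation under the usual normal approximation, using the same counts as the call above; the variable names are chosen only for this example.

```r
# Wald CI for the difference of two independent proportions
x1 <- 20; n1 <- 50   # arm 1: responders / subjects
x2 <- 22; n2 <- 50   # arm 2: responders / subjects

p1 <- x1 / n1
p2 <- x2 / n2
rd <- p1 - p2        # difference in ORR (arm 1 minus arm 2)

# Standard error under the normal approximation, unpooled variances
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci_diff <- rd + c(-1, 1) * qnorm(0.975) * se

c(est = rd, lwr.ci = ci_diff[1], upr.ci = ci_diff[2])
```

The Wald interval is the simplest choice; it can behave poorly for small samples or proportions near 0 or 1, which is why other methods are often considered.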

For the unstratified analysis of the odds ratio, we can use the DescTools::OddsRatio() function, or logistic regression using glm() with a logit link. Below is the code to get the odds ratio and the corresponding Wald CI using OddsRatio().

DescTools::OddsRatio(matrix(c(20, 22, 30, 28), nrow = 2, byrow = TRUE),
  method = "wald", conf.level = 0.95
)

## odds ratio     lwr.ci     upr.ci 
##  0.8484848  0.3831831  1.8788054 

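The glm() function gives the same odds ratio, as shown below. Note that confint() on a glm fit returns profile-likelihood limits, so the CI differs slightly from the Wald CI above; confint.default(fit) returns Wald limits.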

fit <- glm(orr ~ trtpn, data = dta, family = binomial(link = "logit"))
exp(cbind(Odds_Ratio = coef(fit), confint(fit)))

##             Odds_Ratio     2.5 %   97.5 %
## (Intercept)  0.7857143 0.4450719 1.369724
## trtpn1       0.8484848 0.3811997 1.879735
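The small gap between this interval and the Wald interval from OddsRatio() comes from the CI method: confint() on a glm fit uses the profile likelihood. As a sketch, the model can be refit on the aggregated counts from the count table above and passed to confint.default(), which uses estimate +/- z * SE. The data frame agg and the object names are introduced only for this illustration.

```r
# Aggregated counts per arm, taken from the count table above
agg <- data.frame(
  trtpn   = factor(c(2, 1), levels = c(2, 1)),  # arm 2 is the reference level
  resp    = c(22, 20),                          # responders
  nonresp = c(28, 30)                           # non-responders
)

# Logistic regression on grouped data: response is cbind(successes, failures)
fit_agg <- glm(cbind(resp, nonresp) ~ trtpn, data = agg,
               family = binomial(link = "logit"))

# Wald limits on the log-odds scale, then exponentiated to odds ratios
wald <- exp(cbind(Odds_Ratio = coef(fit_agg), confint.default(fit_agg)))
wald
```

With a single binary covariate the model is saturated, so the Wald limits for trtpn1 should agree with the hand formula based on the reciprocal cell counts.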

For the stratified analysis of the odds ratio, I have found two approaches. One is the Cochran-Mantel-Haenszel chi-squared test using the mantelhaen.test() function; the other is conditional logistic regression using survival::clogit() with strata() in the formula. Let's have a look at the specific steps.

Assuming that we want to consider three stratification factors in our CMH test, we should pre-process the data before passing them to mantelhaen.test(), because this function expects a 3-dimensional array (here 2 x 2 x K) whose last dimension indexes the strata.

# pre-process
df <- dta %>% count(trtpn, orr, strata1, strata2, strata3)
tab <- xtabs(n ~ trtpn + orr + strata1 + strata2 + strata3, data = df)
tb <- as.table(array(c(tab), dim = c(2, 2, 2 * 2 * 2)))
# CMH analysis
mantelhaen.test(tb, correct = FALSE)

## Mantel-Haenszel chi-squared test without continuity correction
## data:  tb
## Mantel-Haenszel X-squared = 0.40574, df = 1, p-value = 0.5241
## alternative hypothesis: true common odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.3376522 1.7320849
## sample estimates:
## common odds ratio 
##         0.7647498

PS. If we stratify by a single factor such as strata1, we get the same result as SAS proc freq. You can also use vcdExtra::CMHtest to compute the CMH p-value, but to obtain the same p-value as SAS, a modification has to be made to the vcdExtra package. Refer to this GitHub issue: https://github.com/friendly/vcdExtra/issues/3.
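The single-factor case can be sketched in base R with the cell counts from the SAS example, entered by hand as a 2 x 2 x 2 array. The array layout and object names are assumptions made for this illustration; the last lines recompute the Mantel-Haenszel common odds ratio from its formula so the estimate can be checked against mantelhaen.test().

```r
# Counts from the SAS example: arm x response (1 = responder, 2 = non-responder) x Strata1
tab1 <- array(
  c( 8, 13, 17, 20,    # Strata1 = A: (arm1, resp) (arm2, resp) (arm1, non) (arm2, non)
    12,  9, 13,  8),   # Strata1 = B, same order
  dim = c(2, 2, 2),
  dimnames = list(trtpn = c("1", "2"), orr = c("1", "2"), strata1 = c("A", "B"))
)

# CMH test and Mantel-Haenszel common odds ratio, stratified by Strata1 only
res <- mantelhaen.test(tab1, correct = FALSE)
res$estimate

# The same estimate by hand: sum(a * d / n) / sum(b * c / n) over strata
n_k   <- apply(tab1, 3, sum)
mh_or <- sum(tab1[1, 1, ] * tab1[2, 2, ] / n_k) /
         sum(tab1[1, 2, ] * tab1[2, 1, ] / n_k)
mh_or   # 1024 / 1339, about 0.7647
```

The hand formula shows that the Mantel-Haenszel estimate is a weighted combination of the stratum-level 2x2 tables rather than a model-based estimate.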

Conditional logistic regression is simple to set up: just add strata() to the formula.

library(survival)  # provides clogit() and strata()
fit <- clogit(formula = orr ~ trtpn + strata(strata1, strata2, strata3), data = dta)
exp(cbind(Odds_Ratio = coef(fit), confint(fit)))

##        Odds_Ratio    2.5 %   97.5 %
## trtpn1  0.7592608 0.335024 1.720704


To sum up, this is my brief summary of the statistical analysis of ORR and odds ratio in R and SAS. CMH is a widely used method to test the association between treatment and a binary outcome when you want to account for stratification factors. Lastly, one question remains unanswered: why do we obtain different results from logistic regression and the CMH test when computing the stratified odds ratio? I'm still looking for the answer.


References

Introduction to tern
Calculation of Cochran–Mantel–Haenszel Statistics for Objective Response and Clinical Benefit Rates and the Effects of Stratification Factors
Estimating Binomial Proportion Confidence Interval with Zero Frequency Response using FREQ Procedure
The path less trodden - PROC FREQ for ODDS RATIO
R: How to Calculate Odds Ratios in Logistic Regression Model