R - Add a blank row after each group

I'm an R lover and believe that anything SAS can do, R can do better, since R is such a powerful language for statistical analysis in clinical trials. I once posted an article on how to insert a blank row after each group of data, so I looked up how to do the same thing in R.

To do this, we need just two steps:

  • Split the data frame by group.
  • Add a blank row.

The idea is extremely clear and similar to the SAS process. Let's see how to complete these two steps.

First, I create some test data:

library(tidyverse)

# Keep the first three rows of each species and move Species to the first column
data <- iris %>% group_by(Species) %>%
  slice_head(n = 3) %>%
  select(Species, everything())

> data
# A tibble: 9 × 5
# Groups:   Species [3]
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
  <fct>             <dbl>       <dbl>        <dbl>       <dbl>
1 setosa              5.1         3.5          1.4         0.2
2 setosa              4.9         3            1.4         0.2
3 setosa              4.7         3.2          1.3         0.2
4 versicolor          7           3.2          4.7         1.4
5 versicolor          6.4         3.2          4.5         1.5
6 versicolor          6.9         3.1          4.9         1.5
7 virginica           6.3         3.3          6           2.5
8 virginica           5.8         2.7          5.1         1.9
9 virginica           7.1         3            5.9         2.1

Now I'd like to insert a blank row after each Species group, which means inserting a row between rows 3 and 4 and between rows 6 and 7. To do that, we first use the group_split function to split the data by the Species variable.

data %>% group_split(Species)
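
To check what group_split returns before going further, here is a small sketch that rebuilds the test data and inspects the result. The name pieces is only for illustration. Because data is already grouped by Species, I call group_split() with no arguments here; as far as I know, dplyr ignores extra grouping arguments on an already grouped data frame and may warn about them.

```r
library(dplyr)

# Same test data as above: first three rows of each species
data <- iris %>%
  group_by(Species) %>%
  slice_head(n = 3) %>%
  select(Species, everything())

# data is already grouped by Species, so no arguments are needed here
pieces <- data %>% group_split()

is.list(pieces)        # the result is a list
length(pieces)         # one element per group
sapply(pieces, nrow)   # number of rows in each piece
```

Each element of the list is an ordinary (ungrouped) tibble, which matters because add_row refuses to work on grouped data frames.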

The output is a list of tibbles, one per group, so the next step is to add a blank row to each element and combine the list back into a single data frame. For this we can use purrr, the tidyverse's functional programming package, whose map_dfr function applies a function (here, add_row) to each element of the list and row-binds the results. Calling add_row with no values appends a row of NA, and .after = Inf places it after the last row of each piece.

data %>% group_split(Species) %>%
  map_dfr(~add_row(.x, .after = Inf))

# A tibble: 12 × 5
   Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
   <fct>             <dbl>       <dbl>        <dbl>       <dbl>
 1 setosa              5.1         3.5          1.4         0.2
 2 setosa              4.9         3            1.4         0.2
 3 setosa              4.7         3.2          1.3         0.2
 4 NA                 NA          NA           NA          NA  
 5 versicolor          7           3.2          4.7         1.4
 6 versicolor          6.4         3.2          4.5         1.5
 7 versicolor          6.9         3.1          4.9         1.5
 8 NA                 NA          NA           NA          NA  
 9 virginica           6.3         3.3          6           2.5
10 virginica           5.8         2.7          5.1         1.9
11 virginica           7.1         3            5.9         2.1
12 NA                 NA          NA           NA          NA  
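
A small variation, in case it is useful: in purrr 1.0.0 and later, map_dfr is marked as superseded, and the suggested replacement is map followed by list_rbind. The sketch below assumes purrr 1.0.0 or newer, and it also drops the trailing blank row after the last group, which is my own addition. The names result and result_trimmed are only for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

# Same test data as above: first three rows of each species
data <- iris %>%
  group_by(Species) %>%
  slice_head(n = 3) %>%
  select(Species, everything())

result <- data %>%
  group_split() %>%                       # data is already grouped by Species
  map(~ add_row(.x, .after = Inf)) %>%    # append one NA row to each piece
  list_rbind()                            # stack the pieces into one tibble

# Optional: remove the blank row that follows the last group
result_trimmed <- head(result, -1)
```

Whether to keep the last blank row depends on the report layout; if the blank rows are meant only as separators between groups, result_trimmed is probably the one you want.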

The output above is what I expected. I find the R code more concise and clearer than the SAS version. Do you agree?

Reference

Insert a Blank Row After Each Group of Data