I'm an R lover and believe that anything SAS can do, R can do better; R is a powerful language for statistical analysis in clinical trials. I once posted an article on how to insert blank rows, so I looked up how to do the same thing in R.
To do this, we need just two steps:
- Split the data frame by group.
- Add a blank row.
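The second step relies on add_row from the tibble package, which appends a single all-NA row when it is called without any values. Here is a minimal sketch on a made-up tibble; the names toy, id and label are hypothetical and exist only for illustration:

```r
library(tibble)

# Hypothetical toy data, only to show what add_row() does on its own
toy <- tibble(id = 1:2, label = c("a", "b"))

# No values given: one all-NA row is appended at the end
with_blank <- add_row(toy)

# Same blank row, but placed before the first row instead
with_blank_top <- add_row(toy, .before = 1)

with_blank
with_blank_top
```

The .before and .after arguments control where the new row lands, which is what makes the same function usable for separators as well as for leading or trailing blank rows.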
The idea is extremely clear and similar to the SAS process. Let's see how to complete these two steps.
First, I create some test data:
library(tidyverse)
data <- iris %>%
  group_by(Species) %>%
  slice_head(n = 3) %>%
  select(Species, everything())
> data
# A tibble: 9 × 5
# Groups: Species [3]
Species Sepal.Length Sepal.Width Petal.Length Petal.Width
<fct> <dbl> <dbl> <dbl> <dbl>
1 setosa 5.1 3.5 1.4 0.2
2 setosa 4.9 3 1.4 0.2
3 setosa 4.7 3.2 1.3 0.2
4 versicolor 7 3.2 4.7 1.4
5 versicolor 6.4 3.2 4.5 1.5
6 versicolor 6.9 3.1 4.9 1.5
7 virginica 6.3 3.3 6 2.5
8 virginica 5.8 2.7 5.1 1.9
9 virginica 7.1 3 5.9 2.1
Now I'd like to insert a blank row between each Species group, which means inserting a row between rows 3 and 4 and between rows 6 and 7. To do that, we use the group_split function to split the data by the Species variable.
# data is already grouped by Species, so group_split() needs no extra argument
data %>% group_split()
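A quick way to check what this call returns is to store the result and inspect it. The sketch below rebuilds the same test data so that it runs on its own; the variable name pieces is just an illustrative choice:

```r
library(dplyr)

# Same test data as above: the first three rows of each species
data <- iris %>%
  group_by(Species) %>%
  slice_head(n = 3) %>%
  select(Species, everything())

# data is already grouped by Species, so no grouping argument is needed
pieces <- data %>% group_split()

is.list(pieces)             # the result is a list of tibbles
length(pieces)              # one element per species
sapply(pieces, nrow)        # number of rows in each element
is_grouped_df(pieces[[1]])  # each element is an ungrouped tibble
```

The last check matters for the next step: add_row refuses to work on grouped data frames, so splitting into ungrouped pieces is what makes it possible to add a row to each group.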
The output is a list, so the next step is to add a blank row to each element and combine the list back into a single data frame. For this we can use the functional programming package purrr, whose map_dfr function applies a function (here, add_row) to each element of the list and row-binds the results.
data %>%
  group_split() %>%
  map_dfr(~ add_row(.x, .after = Inf))
# A tibble: 12 × 5
Species Sepal.Length Sepal.Width Petal.Length Petal.Width
<fct> <dbl> <dbl> <dbl> <dbl>
1 setosa 5.1 3.5 1.4 0.2
2 setosa 4.9 3 1.4 0.2
3 setosa 4.7 3.2 1.3 0.2
4 NA NA NA NA NA
5 versicolor 7 3.2 4.7 1.4
6 versicolor 6.4 3.2 4.5 1.5
7 versicolor 6.9 3.1 4.9 1.5
8 NA NA NA NA NA
9 virginica 6.3 3.3 6 2.5
10 virginica 5.8 2.7 5.1 1.9
11 virginica 7.1 3 5.9 2.1
12 NA NA NA NA NA
The output above is what I expected. I find the R code more concise and clearer than the SAS version. Do you agree?
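One detail worth noting: the result also ends with a blank row after the last group (row 12). If only the separators between groups are wanted, one possible variation is to drop that final row. The sketch below also swaps map_dfr for map followed by list_rbind, since map_dfr is superseded in purrr 1.0.0; it assumes purrr 1.0.0 or later is installed, and the name between_only is just illustrative:

```r
library(dplyr)
library(purrr)   # assumes purrr >= 1.0.0 for list_rbind()
library(tibble)

# Same test data as above: the first three rows of each species
data <- iris %>%
  group_by(Species) %>%
  slice_head(n = 3) %>%
  select(Species, everything())

between_only <- data %>%
  group_split() %>%        # one ungrouped tibble per species
  map(~ add_row(.x)) %>%   # append one all-NA row to each piece
  list_rbind() %>%         # stack the pieces back into one tibble
  head(-1)                 # drop the blank row after the last group

between_only
```

This leaves blank rows only at positions 4 and 8, between the three species.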