# KeepNotes blog

Stay hungry, Stay Foolish.

This post shows how to display descriptive statistics for variables quickly, using common and concise approaches in SAS.

The following examples answer a few simple but common questions:

• How to count distinct values
• How to count values by group
• How to produce the frequency table of variables
• How to calculate the statistics for variables

In R, `Hmisc::describe` covers much of this, though it is not the only option: `base` functions such as `summary` and other external packages work well too.

#### Count Values or Distinct Values

Here we use `proc sql` with the `sashelp.BirthWgt` dataset to count the values of the `Race` variable. Note that `count(Race)` counts only non-missing values.

```sas
proc sql;
  /* count() of a column counts its non-missing values */
  select count(Race) as cnt_race
  from sashelp.BirthWgt;
quit; /* proc sql ends with quit, not run */
```

Counting the total number of `Race` values alone is not very informative, though. To count the `Married` values within each level of `Race`, add a `group by` clause:

```sas
proc sql;
  /* one row per level of Race */
  select Race, count(Married) as cnt_married
  from sashelp.BirthWgt
  group by Race;
quit;
```

To count distinct values, add the `distinct` keyword inside the `count` function.

```sas
proc sql;
  /* number of unique non-missing values of Married */
  select count(distinct Married) as distinct_married
  from sashelp.BirthWgt;
quit;
```
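The three kinds of count can also be requested side by side, which makes the difference between them easier to see. The following is a minimal sketch rather than a definitive recipe; the column aliases (`n_rows`, `n_married`, `n_distinct_married`) are names chosen here for illustration.

```sas
proc sql;
  select Race,
         count(*)                as n_rows,             /* all rows in the group */
         count(Married)          as n_married,          /* non-missing values of Married */
         count(distinct Married) as n_distinct_married  /* unique non-missing values */
  from sashelp.BirthWgt
  group by Race;
quit;
```

If `Married` has no missing values within a group, `n_rows` and `n_married` will be equal for that group.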

#### Frequency Table

We can use `proc freq` to create frequency tables for one or more variables. The example below tabulates the `SomeCollege` variable, including missing values, separately for each level of `Race`, and saves the result, with cumulative frequencies and percentages, to the `result` dataset. Because `by` processing needs sorted input, the data is sorted by `Race` first.

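```sas
/* SASHELP is usually read-only, so write the sorted copy to WORK */
proc sort data=sashelp.BirthWgt out=work.birthwgt;
  by Race;
run;

/* missing: treat missing values as a level; outcum: add cumulative columns to out= */
proc freq data=work.birthwgt;
  tables SomeCollege / out=result missing outcum;
  by Race;
run;
```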

By the way, adding a statistical option such as `chisq` to the `tables` statement also produces chi-square tests in the output.
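As a rough illustration of the `chisq` option, the sketch below requests a two-way table of `Race` by `Married`. Pairing these two variables is an assumption made here for the example, not something the original analysis requires.

```sas
proc freq data=sashelp.BirthWgt;
  /* two-way table of Race by Married, plus chi-square tests of association */
  tables Race*Married / chisq;
run;
```

On a one-way table, such as `tables SomeCollege / chisq;`, the same option instead requests a chi-square test of equal proportions.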

#### Descriptive Statistics

Alternatively, we can use `proc tabulate` to build a table that displays multiple statistics at once.

```sas
proc tabulate data=sashelp.cars;
  var weight;
  table weight * (N Min Q1 Median Mean Q3 Max);
run;
```

However, I find `proc means` more convenient when the output needs to be saved to a dataset, for example:

```sas
proc means data=sashelp.cars n nmiss mean std median p25 p75 min max;
  var weight;
  output out=weight_tbl n=n nmiss=nmiss mean=mean std=std median=median
         p25=p25 p75=p75 min=min max=max;
run;
```
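The same approach extends to grouped statistics. The sketch below is one possible variation, assuming we want the statistics per level of `Origin` (a variable in `sashelp.cars`). The output dataset name `weight_by_origin` is hypothetical, and the `autoname` option is used so that SAS names the output columns instead of each one being listed by hand.

```sas
proc means data=sashelp.cars noprint nway;
  class Origin;                        /* one output row per level of Origin */
  var weight;
  output out=weight_by_origin          /* hypothetical dataset name */
         n= mean= median= / autoname;  /* names built from variable + statistic, e.g. weight_Mean */
run;
```

Here `noprint` suppresses the printed report and `nway` keeps only the rows for the class levels, dropping the overall summary row. The output dataset also carries the automatic `_TYPE_` and `_FREQ_` columns.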