SAS/Descriptive Statistics

List and describe your data

Describe your data:

proc contents lists the variables of a data set together with their type, length and other attributes.

 proc contents data= lib.data ;
 title "Describe the content of a database";
 run;
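
To get an overview of a whole library rather than a single data set, the special name _all_ can be used. The sketch below assumes the same libref lib as above; the nods option suppresses the per-data-set details so that only the directory listing is shown.

 proc contents data=lib._all_ nods;
 title "Data sets in the library";
 run;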

List your data in the output:

proc print prints the data in the output window. The firstobs= option gives the number of the first observation to print and the obs= option the number of the last one (not a count), so the example below prints observations 30 to 40.

 proc print data= lib.data (firstobs=30 obs=40);
 title "Partial Listing";
 run;
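
To restrict the listing to a few columns, a var statement can be added. A minimal sketch, assuming the data set contains variables named x1 and x2 (hypothetical names); noobs drops the observation-number column.

 proc print data=lib.data (firstobs=30 obs=40) noobs;
 var x1 x2;
 title "Partial listing of selected variables";
 run;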

Discrete Variables

 proc freq data=lib_name.data_name;
 tables x1 x2 ;
 title "Frequency table";
 run;
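
The weight statement names a variable whose values are used to weight each observation. In the example below, the out= option saves the frequency table to the data set temp4 and outexpect asks for expected frequencies to be added to it.
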
 proc freq data=lib_name.data_name;
 weight extri;
 tables x1 / out=temp4 outexpect;
 run;
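
The data set created by out= can then be inspected like any other. A short sketch, assuming the step above has run: temp4 holds one row per level of x1, with the frequency in the variable COUNT and the percentage in PERCENT.

 proc print data=temp4;
 title "Saved frequency table";
 run;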

Contingency Tables

 proc freq data=lib_name.data_name;
 tables x1*x2 ;
 title "contingency table";
 run;
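
A test of independence between the two variables can be requested with the chisq option of the tables statement. This is a sketch on the same hypothetical data set; expected additionally prints the expected cell frequencies under independence.

 proc freq data=lib_name.data_name;
 tables x1*x2 / chisq expected;
 title "Contingency table with chi-square test";
 run;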

Continuous Variables

proc means computes descriptive statistics for each variable listed in the var statement, or for every numeric variable in the data set if there is no var statement. The following keywords, given on the proc means statement, select the statistics to display:

  • n : number of non-missing values
  • sum : sum of the values
  • range : largest value minus smallest value
  • mean : average
  • var : variance
  • stddev : standard deviation

 proc means data=lib.data n sum range mean var stddev;
 var x1 x2;
 run;
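
Other keywords follow the same pattern. For instance, the sketch below asks for the median, minimum and maximum and uses the maxdec= option to limit the output to two decimal places; the data set and variable names are the same placeholders as above.

 proc means data=lib.data n mean median min max maxdec=2;
 var x1 x2;
 run;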

The class statement produces the statistics separately for each level of the categorical variable it names. The weight statement names a variable used to weight the observations.

 proc means data=lib_name.data_name;
 var x1 x2;
 class sexe;
 weight extri;
 run;
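
The group statistics can also be saved to a data set with an output statement. A minimal sketch: the data set name stats and the variable names mean_x1 and mean_x2 are made up for illustration, and noprint suppresses the printed table.

 proc means data=lib_name.data_name noprint;
 var x1 x2;
 class sexe;
 output out=stats mean=mean_x1 mean_x2;
 run;

Unless the nway option is added, the output data set also contains an overall row (with _TYPE_ equal to 0) in addition to one row per group.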

proc univariate offers more options than proc means. It also reports quantiles, and its histogram statement draws a histogram on which fitted densities can be overlaid.

 proc univariate data=lib_name.data_name;
 var x1;
 histogram / normal(color=red mu=0 sigma=0.045) kernel(color=blue);
 title "Proc Univariate";
 run;
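
Quantiles that are not printed by default can be requested and stored with an output statement. In this sketch the data set name quant and the prefix p are arbitrary choices; the resulting data set contains the variables p5 and p95.

 proc univariate data=lib_name.data_name noprint;
 var x1;
 output out=quant pctlpts=5 95 pctlpre=p;
 run;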

Kernel and Histograms

To draw a histogram or a kernel density estimate, use the histogram statement of either proc univariate or proc capability (the latter is part of SAS/QC).

 proc univariate data=lib_name.data_name;
 var x1;
 histogram / normal(color=red mu=0 sigma=0.045) kernel(color=blue);
 title "Proc Univariate";
 run;

With proc capability:

 proc capability data=lib_name.data_name;
 histogram x1 / normal(color=red mu=0 sigma=0.045)
 kernel(color=blue);
 title "Proc Capability";
 run;


Correlations and scatterplots

 proc corr data=lib_name.data_name;
 var x1 x2 x3;
 weight extri;
 title "Linear correlation";
 run;
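
One way to draw the scatterplot is proc sgplot (available from SAS 9.2). A minimal sketch plotting x2 against x1 on the same hypothetical data set:

 proc sgplot data=lib_name.data_name;
 scatter x=x1 y=x2;
 title "Scatterplot of x2 against x1";
 run;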

T Test

The following code tests the null hypothesis that the mean of the variable x in the data set taille equals 1.75. The alpha=0.05 option sets the level of the confidence intervals to 95%.

 proc ttest data=taille h0=1.75 alpha=0.05;
 var x;
 run;
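
To compare the mean of x between two groups instead, a class statement can be used. This sketch assumes that taille also contains a two-level variable named sexe, which is an assumption and not something stated above.

 proc ttest data=taille alpha=0.05;
 class sexe;
 var x;
 run;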