SOME UNDERUTILIZED SAS PROCEDURES
Part 1
Objective : This post aims to make readers aware of what some rarely used, and some used but underutilized, procedures can do in day-to-day programming. It is not an exhaustive reference for these procedures; the goal is to draw readers' attention to them through interesting examples.
Theory: SAS has a long and ever-growing list of procedures. I have covered just a few that are generally not used much but are very useful.
Procedures:
1) Proc RANK
This useful procedure is highly underutilized, probably because it does the seemingly small job of assigning ranks.
The RANK procedure computes ranks for one or more numeric variables
across the observations of a SAS data set and outputs the ranks to a new SAS
data set. The syntax of the procedure is :
PROC RANK <option(s)>;
BY <DESCENDING> variable-1 <…<DESCENDING> variable-n> <NOTSORTED>;
VAR data-set-variable(s);
RANKS new-variable(s);
By - Calculates ranks separately for each BY group.
Ranks - Names the variable(s) that will hold the ranks. If omitted, the ranks replace the values of the original variable in the output data set.
Var - Specifies the variable(s) to be ranked.
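There are different ranking methods that can be used. For full details about each one, read the SAS Procedures Guide.
The procedure can also compute rank-based statistics, such as quantile groups (GROUPS=), fractional and percentage ranks (FRACTION, PERCENT) and normal scores (NORMAL=).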
Note :-
One question frequently asked in interviews is: how do I select the third highest salary from employees? The usual answers are to sort the data and pick the 3rd row, and so on, but proc rank provides an elegant way of doing it.
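/* Sample data: one row per employee */
data employees;
input id salary;
datalines;
1 2000
2 3000
3 1000
4 5000
5 1500
6 1200
;
run;
/* DESCENDING gives rank 1 to the highest salary; */
/* the WHERE= option keeps only the row ranked 3rd */
proc rank data=employees out=salaries(where=(rank=3)) descending;
var salary;
ranks rank;
run;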
So now you can select the nth highest salary just by changing the number in the where condition. Drop the DESCENDING option to select the nth lowest salary instead.
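The example above works because no two employees share a salary. By default PROC RANK gives tied values the mean of their ranks, so with ties there may be no observation whose rank is exactly 3. A minimal sketch of one way to handle this, combined with the BY statement, is shown below. The emp_dept data set, the dept variable and the salary_rank name are made up for illustration, and TIES=DENSE assumes SAS 9.2 or later.

```sas
/* Hypothetical data: two departments, with a tie in department A */
data emp_dept;
   input dept $ id salary;
   datalines;
A 1 2000
A 2 3000
A 3 3000
A 4 1000
B 5 5000
B 6 1500
B 7 1200
;
run;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=emp_dept;
   by dept;
run;

/* TIES=DENSE numbers the distinct salaries 1, 2, 3, ... without gaps, */
/* so rank 3 is the third highest distinct salary in each department   */
proc rank data=emp_dept out=third_highest(where=(salary_rank=3))
          descending ties=dense;
   by dept;
   var salary;
   ranks salary_rank;
run;

proc print data=third_highest noobs;
run;
```

With dense ranks the two 3000 salaries in department A share rank 1, so the row kept for that department is the one with salary 1000.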
2) Proc COMPARE :
This procedure compares the contents of two SAS datasets, or selected variables from the same dataset.
Proc compare is generally used to validate newly created datasets. For example, suppose I create an inventory dataset for the current month which is to be appended to the master inventory dataset. Before appending, it is good to run Proc compare on the two datasets while leaving the observation values out of the report (see the NOVALUES option), so that you can see the differences in formats, labels, data types etc.
PROC COMPARE generates the following information about the two data sets that are being compared:
- Whether matching variables have different values
- Whether one data set has more observations than the other
- What variables the two data sets have in common
- How many variables are in one data set but not in the other
- Whether matching variables have different formats, labels, or types
- A comparison of the values of matching observations
The NOVALUES option suppresses the part of the output that shows the differences in the values of matching variables.
Syntax :
PROC COMPARE BASE=SAS-data-set COMPARE=SAS-data-set <option(s)>;
BY <DESCENDING> variable-1 <…<DESCENDING> variable-n> <NOTSORTED>;
ID <DESCENDING> variable-1 <…<DESCENDING> variable-n> <NOTSORTED>;
VAR variable(s);
WITH variable(s);
Run;
By - Produces a separate comparison for each BY group.
ID - Identifies the variables used to match observations.
Var - Restricts the comparison to the values of specific variables.
With - Used to compare variables that have different names in the two datasets, or two variables from the same dataset.
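The inventory scenario described above could look something like the following sketch. The data set names inv_master and inv_month and their variables are hypothetical; the options used (LISTALL, NOVALUES) and the ID and VAR statements are standard PROC COMPARE features.

```sas
/* Hypothetical master and current-month inventory data sets */
data inv_master;
   input item_id qty;
   datalines;
1 10
2 20
3 30
;
run;

data inv_month;
   input item_id qty;
   datalines;
1 10
2 25
4 40
;
run;

/* Attribute check only: NOVALUES leaves the value differences */
/* out of the report, so formats, labels and types stand out   */
proc compare base=inv_master compare=inv_month novalues;
run;

/* Value check: match observations on item_id (both data sets   */
/* are already in item_id order) and compare qty; LISTALL also  */
/* lists observations and variables found in only one data set  */
proc compare base=inv_master compare=inv_month listall;
   id item_id;
   var qty;
run;
```

When an ID statement is used, both data sets are expected to be sorted by the ID variables (or the NOTSORTED option must be given).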
3) Proc CONTENTS:
Syntax :
Proc contents data=dset <out=dset1>;
Run;
The contents procedure is one of the most popular procedures in SAS. Generally this procedure is used to see the contents of a dataset in a listing output, to check variable formats, informats, labels etc.
But there is another way to use this procedure: the out= option stores the contents information in an output dataset, with one observation per variable. This dataset holds valuable information about the dataset on which the procedure was run (variables, types, formats, informats, labels etc.). It also holds some less obvious but useful details, such as the sort information (the SORTED and SORTEDBY variables).
This output dataset can be used to automate repetitive processes. For example, if you need to report all datasets in a library and list the variable(s) each one is sorted by, this dataset can be used.
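As a rough sketch of the sorted-by report mentioned above, the code below runs PROC CONTENTS over every dataset in a library and keeps only the variables that take part in a sort key. The sales data set is made up for illustration, and the code assumes that SORTEDBY in the output dataset holds the position of the variable in the sort key and is missing (or zero) otherwise.

```sas
/* Hypothetical data set, sorted so that there is something to report */
data work.sales;
   input region $ amount;
   datalines;
West 200
East 100
;
run;

proc sort data=work.sales;
   by region;
run;

/* _ALL_ processes every data set in the library;           */
/* NOPRINT suppresses the listing, OUT= saves the metadata  */
proc contents data=work._all_
              out=work.meta(keep=memname name sortedby)
              noprint;
run;

/* Order the sort-key variables within each data set */
proc sort data=work.meta;
   by memname sortedby;
run;

/* One row per sort-key variable of each sorted data set */
proc print data=work.meta noobs;
   where sortedby not in (., 0);
run;
```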
4) Proc COPY :
Syntax :
PROC COPY OUT=libref-1 IN=libref-2 <CLONE|NOCLONE> <CONSTRAINT=YES|NO>
<DATECOPY> <INDEX=YES|NO> <MEMTYPE=(mtype-1 <...mtype-n>)>
<MOVE <ALTER=alter-password>>;
EXCLUDE SAS-file-1 <...SAS-file-n> </ MEMTYPE=mtype>;
SELECT SAS-file-1 <...SAS-file-n> </ <MEMTYPE=mtype> <ALTER=alter-password>>;
Run;
The COPY procedure copies one or more SAS files from a SAS library. Generally, the COPY procedure functions the same as the COPY statement in the DATASETS procedure. The two differences are as follows:
- The IN= argument is required with PROC COPY. In the COPY statement, IN= is optional. If IN= is omitted, the default value is the libref of the procedure input library.
- PROC DATASETS cannot work with libraries that allow only sequential data access.
The COPY procedure, along with the XPORT engine and the XML engine, can create and read transport files that can be moved from one host to another. PROC COPY can create transport files only with SAS data sets, not with catalogs or other types of SAS files.
Transporting is a three-step process:
1. Use PROC COPY to copy one or more SAS data sets to a file that is created with either the transport (XPORT) engine or the XML engine. This file is referred to as a transport file and is always a sequential file.
2. After the file is created, you can move it to another operating environment via communications software, such as FTP, or tape. If you use communications software, be sure to move the file in binary format to avoid any type of conversion. If you are moving the file to a mainframe, the file must have certain attributes.
3. After you have successfully moved the file to the receiving host, use PROC COPY to copy the data sets from the transport file to a SAS library.
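Steps 1 and 3 above can be sketched as follows. The file path and the emp data set are hypothetical, and both halves are shown in one program only for illustration; in practice the second half runs on the receiving host after the file has been moved in binary mode. Note that the XPORT engine writes an older transport format that limits data set and variable names to 8 characters, which is why a short name is used here.

```sas
/* Hypothetical data set with a name of 8 characters or fewer */
data work.emp;
   input id salary;
   datalines;
1 2000
2 3000
;
run;

/* Step 1 (sending host): write EMP to a transport file. */
/* The path is an assumption; change it for your system. */
libname trans xport '/tmp/emp.xpt';

proc copy in=work out=trans memtype=data;
   select emp;
run;

libname trans clear;

/* Step 3 (receiving host): read the transport file back */
/* into a regular SAS library                            */
libname trans xport '/tmp/emp.xpt';

proc copy in=trans out=work;
run;

libname trans clear;
```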
5) Proc DATASETS :
Proc datasets is a multipurpose procedure designed to accomplish various tasks for managing your SAS files. Proc datasets can do the following:
- copy SAS files from one SAS library to another
- rename SAS files
- repair SAS files
- delete SAS files
- list the SAS files that are contained in a SAS library
- list the attributes of a SAS data set, such as:
  - the date when the data was last modified
  - whether the data is compressed
  - whether the data is indexed
- manipulate passwords on SAS files
- append SAS data sets
- modify attributes of SAS data sets and variables within the data sets
- create and delete indexes on SAS data sets
- create and manage audit files for SAS data sets
- create and delete integrity constraints on SAS data sets
Syntax:
Because proc datasets supports such a variety of tasks, its full syntax is mind-boggling and there is little point in reproducing it here (see the syntax in the SAS 9.2 procedures guide). Instead I'll list a few of the statements which are used frequently.
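One good use of proc datasets is to delete all or selected datasets from a library. The KILL option below deletes every dataset in WORK, so use it with care; to delete only selected datasets, use the DELETE statement instead.
/* KILL deletes all members of the given MEMTYPE from the library */
proc datasets lib=work kill memtype=data;
run;
quit;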
Other important statements are:
APPEND - adds observations from one data set to another. This is most useful when the base file is large and a small file needs to be added.
CHANGE - changes the name of a SAS file in the input data library.
COPY - copies some or all members of one SAS library to another. This is primarily used to move datasets from one system or version to another.
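A small sketch of APPEND and CHANGE, together with the DELETE statement for removing a selected dataset, is given below. The master and monthly data sets are made up for illustration.

```sas
/* Hypothetical base and incoming data sets */
data work.master;
   input item_id qty;
   datalines;
1 10
2 20
;
run;

data work.monthly;
   input item_id qty;
   datalines;
3 30
;
run;

proc datasets lib=work nolist;
   /* add the rows of MONTHLY to the end of MASTER; */
   /* the existing rows of MASTER are not reread    */
   append base=master data=monthly;
run;
   /* rename MONTHLY to MONTHLY_DONE */
   change monthly=monthly_done;
run;
   /* delete only the selected data set */
   delete monthly_done;
run;
quit;
```

The RUN statements inside the step separate the run groups, and QUIT ends the procedure.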
Conclusion : This post aims to create awareness of procedures which are used to only a small part of their capability, and of procedures which are very handy for doing a small job but are unpopular.
This is the first post in the procedures series; more will follow soon.
Will be back with some more magic of SAS. Till then, goodbye.
Saurabh Singh Chauhan
(er.chauhansaurabh@gmail.com)
Note: Comments and suggestions are always welcome
References :
SAS 9.2 procedures guide by SAS Institute.
Disclaimer :
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies. The contents of this post are the works of the author(s) and do not necessarily represent the opinions, recommendations, or practices of any organization whatsoever.