Selecting N rows from a dataset or K-th element from a macro variable is a common data manipulation process. This post is listing the ways how to resolve these questions.
First or Last N Rows
proc sql method,
outobs both can be used to select N rows from a dataset, but worth noting that it will cause differences if you join tables.
proc sql inobs = 5 /*outobs=5*/; create table cls as select * from sashelp.class; quit;
For the SAS code instead of
proc sql, the most straightforward method is to utilize the
obs that is very similar to the
sql method. Otherwise if you would like to select a range of rows, just add another parameter
data raw_o; set sashelp.class(firstobs = 5 obs = 10); run;
_N_ variable with the
IF-ELSE statement to reach this purpose I suppose is more flexible sometimes.
data raw_1; set sashelp.class; if 5<= _N_ <=10 then output; run;
So how about selecting the last rows? It seems we have to know the total number of rows, and then utilize the
_N_ variable once.
data raw_2; set sashelp.class; if &n_rows.-4<=_N_<=&n_rows. then output; run;
BTW how to select N observations randomly, we can use the
proc surveyselect procedure and define
method = srs as the simple random selection method, so that we get the random 5 rows from this dataset.
proc surveyselect data = sashelp.class out = rd_class method = srs sampsize = 5 seed = 123456; run;
Besides for the row number, we can also add a new row number by group, as shown the following example:
proc sort data = sashelp.class out = sorted_class; by age; run; data sorted_class_2; set sorted_class; by age; if first.age then new_row_number=0; new_row_number+1; run;
K-th Elements from Macro Variable
First off, we would want to create a macro list to store information. If we just want to store a single value, as following:
proc sql; select count(name) into: n_name trimmed from sashelp.class; quit; %put &n_name;
Storing multiple values is also very similar.
proc sql; select count(name),mean(height) format=10.2 into: n_name trimmed, :mean_height trimmed from sashelp.class; quit; %put &n_name &mean_height;
Or just simply want the list values to be assigned to a list of macro variables.
proc sql; select distinct(name) into: n1-:n19 from sashelp.class; quit;
But the above example seems like you have to know the total number of distinct values in the dataset. So maybe the common way is to store the column values in a list separated by any delimiter you want.
proc sql; select distinct(name) into: nameList separated by ' ' from sashelp.class; /* %let numNames = &sqlobs;*/ quit; %put &nameList.;
Supposed that I just want the second element in this macro variable, what should I do? Maybe the
%scan function is enough to reach our purpose.
Obviously, it seems not as convenient to extract the element as
nameList in R, but it is enough to use in SAS.
Another way is to loop the macro variable by
%do, as below to assign each element to a new column.
%let cntName = %sysfunc(countw(&nameList)); data raw; array names[&cntName] $200 name1-name&cntName; do i=1 to &cntName; names[i]=scan("&nameList", i); end; drop i; run;
Otherwise there are still many great documents that have been posted to show how to store and manipulate lists in SAS, like Choosing the Best Way to Store and Manipulate Lists in SAS
How to use the SAS SCAN Function?
Creating Lists! Using the Powerful INTO Clause with PROC SQL to Store Information in Macro Lists
How to Select the First N Rows in SAS
Using SAS® Macro Variable Lists to Create Dynamic Data-Driven Programs
Storing and Using a List of Values in a Macro Variable
Please indicate the source: http://www.bioinfo-scrounger.com