# KeepNotes blog



These notes follow Chapter 3, "Statistical and mathematical functions", Chapter 4, "Programming and operating system interface", and Chapter 5, "Common statistical procedures", of *SAS and R: Data Management, Statistical Analysis, and Graphics* (second edition).

#### Cumulative distribution function

``````# R code
y = pnorm(1.96, mean=0, sd=1)

# SAS code
data normal;
y = cdf("NORMAL", 1.96, 0, 1);
run;``````

#### Setting the random number seed

``````# R code
set.seed(12345)

# SAS code
call streaminit(12345);``````

#### Normal random variables

``````# R code
rnorm(10)

# SAS code
data rand;
call streaminit(12345);
do i=1 to 10;
x=rand("normal", 0, 1);
output;
end;
run;``````

#### Integer functions

``````data sign;
nextintx = ceil(3.49);
justintx = floor(3.49);
roundx = round(3.49, 0.1);   /* nearest tenth: 3.5 */
roundint = round(3.49, 1);   /* nearest integer: 3 */
movetozero = int(3.49);
run;``````
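
The block above is SAS only. As a rough sketch, the base R counterparts would look like the following; the variable names simply mirror the SAS ones and are not taken from the book.

```r
# R code (sketch of the base R counterparts)
x <- 3.49
nextintx   <- ceiling(x)    # smallest integer >= x
justintx   <- floor(x)      # largest integer <= x
roundx     <- round(x, 1)   # round to one decimal place
roundint   <- round(x)      # round to the nearest integer
movetozero <- trunc(x)      # drop the fractional part (toward zero)
```

Note that `round()` in R takes a number of decimal digits, while the SAS `round` function takes a rounding unit such as 0.1 or 1.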

#### Looping

``````# R code
x <- numeric(10)
for (i in 1:10){
x[i] <- rnorm(1)
}

# SAS code
data;
do i = 1 to 10;
x = normal(0);
output;
end;
run;``````

#### Sequence of values or patterns

``````data ds;
do x = 1 to 9 by 2;
output;
end;
run;``````
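
For comparison, a minimal R sketch of the same sequence, plus a repeated pattern, could use `seq()` and `rep()`. The `pattern` vector is an illustrative addition, not part of the SAS example.

```r
# R code (sketch)
x <- seq(1, 9, by = 2)                    # odd numbers from 1 to 9
pattern <- rep(c("A", "B"), times = 3)    # repeat the pair A, B three times
```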

#### Grid of values

``````# R code
data.frame(x1 = rep(1:3, each=2), x2 = rep(c("M","F"), times=3))

# SAS code
data ds;
do x1 = 1 to 3;
do x2 = "M","F";
output;
end;
end;
run;``````

#### Summary statistics

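SAS's approach to summary analysis always makes me feel that SAS is less a programming language than an analysis tool. It "helpfully" bundles several statistics (mean, stddev, max, min, and so on) into a single proc, but that costs flexibility and makes the language feel rigid. The whole philosophy differs from other programming languages, which I find odd. End of rant.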

Using proc means for summary statistics

``````/* proc means produces both printed output and an output data set */
proc means data=sashelp.iris N mean stddev max min;
class Species;
var PetalLength PetalWidth SepalLength SepalWidth;
output out=ds;
run;``````
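
For contrast, here is one way the same per-species summary might be written in R, where the statistics are ordinary functions rather than proc options. This is a sketch on R's built-in `iris` data (not `sashelp.iris`); the names `vars` and `summary_by_species` are made up for illustration.

```r
# R code (sketch): N, mean, sd, max, min by species
vars <- c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")
summary_by_species <- aggregate(
  iris[vars],
  by = list(Species = iris$Species),
  FUN = function(v) c(n = length(v), mean = mean(v), sd = sd(v),
                      max = max(v), min = min(v))
)
# each variable becomes a matrix column with one row per species
summary_by_species$Petal.Length
```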

Using proc univariate for detailed summary statistics (note: use ODS to save the results to a data set)

``````ods output BasicMeasures=ss;
/*ods trace on/listing;*/
proc univariate data=sashelp.iris all;
class Species;
var PetalLength PetalWidth SepalLength SepalWidth;
run;
/*ods trace off;*/``````

#### Calculating Percentiles

``````ods output Quantiles=qt;
proc univariate data=sashelp.iris all;
var PetalLength PetalWidth SepalLength SepalWidth;
run;``````

``````proc univariate data=sashelp.iris;
class Species;
var PetalLength;
output out=iris_percentile
pctlpts = 0,25,50,75,95,100
pctlpre = P_;
run;``````

``````proc means data=sashelp.iris p25 p50 p75 p95;
class Species;
var PetalLength;
output out=perc
p25=p_25
p50=p_50
p75=p_75
p95=p_95;
run;``````

``quantile(c(1:10), c(0.25,0.5,0.75,0.95))``

``quantile(c(1:10), c(0.25,0.5,0.75,0.95), type = 3)``

#### Centering, normalizing, and scaling

``````proc standard data=sashelp.iris out=iris2 mean=0 std=1;
var PetalLength PetalWidth;
run;``````

``scale(iris$Sepal.Length)``

#### Mean and 95% confidence interval

``````proc means data=sashelp.iris lclm mean uclm;
var PetalLength;
run;``````

``t.test(iris$Sepal.Length)$conf.int``

#### Contingency tables

``````data dumy;
input x y @@;
datalines;
0 1 1 0 1 1
0 1 1 1 1 0
1 1 1 1 0 0
1 0 0 0 0 1
;
run;

proc freq data=dumy;
tables x*y / out=freqtable nopercent nocol norow;
run;``````

``````proc freq data=dumy;
tables x*y / chisq relrisk;
run;``````

``````proc freq data=dumy;
tables x*y / agree;
run;``````
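
A rough R counterpart for the cross-tabulation and chi-square test is sketched below. The `x` and `y` vectors are the same twelve pairs as in the `datalines` above, typed in by hand; the agreement statistics requested by `agree` are not covered here.

```r
# R code (sketch): the 12 (x, y) pairs from the SAS datalines
x <- c(0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0)
y <- c(1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1)

tab <- table(x, y)    # 2 x 2 contingency table of counts
tab

# chi-square test without continuity correction;
# R may warn that the approximation is poor for counts this small
chi <- chisq.test(tab, correct = FALSE)

# Fisher's exact test is a common alternative for small tables
fisher <- fisher.test(tab)
```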

#### Correlation

``````# R code
cor.test(iris$Sepal.Length, iris$Sepal.Width)

# SAS code
proc corr data=sashelp.iris;
var PetalLength PetalWidth;
run;``````

#### Tests for normality

``````proc univariate data=sashelp.iris normal;
var PetalLength;
run;``````
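
The `normal` option prints several normality tests, Shapiro-Wilk among them. In base R, a minimal sketch of the Shapiro-Wilk test alone, run on the built-in `iris` data rather than `sashelp.iris`, could be:

```r
# R code (sketch): Shapiro-Wilk test
sw <- shapiro.test(iris$Petal.Length)
sw$statistic   # W statistic
sw$p.value     # a small p-value suggests departure from normality
```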

#### Student's t test

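Student's t test: SAS runs a two-sample t test from one classification variable and one value variable. **Note: the output reports both the pooled (equal-variance) and Satterthwaite (unequal-variance) results.**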

``````data scores;
input Gender $ Score @@;
datalines;
f 75  f 76  f 80  f 77  f 80  f 77  f 73
m 82  m 80  m 85  m 85  m 78  m 87  m 82
;
run;

proc ttest data=scores;
class Gender;
var Score;
run;``````
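
A sketch of the same comparison in R follows. Unlike `proc ttest`, `t.test()` reports one variant per call, so the equal-variance and unequal-variance results are requested separately. The `scores` data frame re-enters the values from the `datalines` above by hand.

```r
# R code (sketch): same scores as the SAS data step
scores <- data.frame(
  Gender = rep(c("f", "m"), each = 7),
  Score  = c(75, 76, 80, 77, 80, 77, 73,
             82, 80, 85, 85, 78, 87, 82)
)

# pooled (equal-variance) t test
pooled <- t.test(Score ~ Gender, data = scores, var.equal = TRUE)

# Welch (unequal-variance) t test, the R default
welch <- t.test(Score ~ Gender, data = scores)
```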

#### Nonparametric tests

``````proc npar1way data=scores wilcoxon edf;
class Gender;
var Score;
run;``````
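
In R, the closest base functions are probably `wilcox.test()` for the Wilcoxon rank-sum test and `ks.test()` for the two-sample Kolmogorov-Smirnov test (one of the EDF tests). The sketch below re-enters the scores by hand and uses the normal approximation because the data contain ties.

```r
# R code (sketch): same scores as the SAS data step
scores <- data.frame(
  Gender = rep(c("f", "m"), each = 7),
  Score  = c(75, 76, 80, 77, 80, 77, 73,
             82, 80, 85, 85, 78, 87, 82)
)

# Wilcoxon rank-sum test; exact = FALSE because of tied scores
w <- wilcox.test(Score ~ Gender, data = scores, exact = FALSE)

# two-sample Kolmogorov-Smirnov test; R may warn about ties
f <- scores$Score[scores$Gender == "f"]
m <- scores$Score[scores$Gender == "m"]
ks <- ks.test(f, m)
```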

#### Logrank test

The logrank test usually appears alongside Kaplan-Meier plots and Cox proportional hazards models; in SAS it is available through `proc lifetest`.

``````proc lifetest data=sashelp.BMT plots=survival(atrisk=0 to 2500 by 500);
time T * Status(0);