Fit skewed normal distribution to data in R

偏态分布(Skewed distribution)是指频数分布不对称,集中位置偏向一侧。若集中位置偏向数值小的一侧,称为正偏态分布;集中位置偏向数值大的一侧,称为负偏态分布。
偏态分布只有满足一定的条件(如样本例数够大等)才可以看做近似正态分布

对于偏态分布的资料可看:
The Skew-Normal Probability Distribution

结合网上资料,偏态分布的参数估计和拟合在R中常用的有sn包和fGarch

Simulating data

fGarch包的rsorm函数来生成一个Skew Normal Distribution,xi为偏态参数;或者用sn包的dsn函数

library(sn)
library(fGarch)

# sn
data <- rsn(1000, xi=-5, omega=2, alpha=5)
# fGarch
data <- rsnorm(1000, mean = 0, sd = 1, xi = 1.5)

Parameter estimation

sn包为例,用selm函数拟合skewed distribution并绘制曲线

set.seed(123)
data <- rsn(1000, xi=-5, omega=2, alpha=5)
# Fit curve
fit <- selm(data ~ 1, family = "SN")
plot(fit)

拟合估计的参数:

> fit@param$dp
        xi      omega      alpha 
 0.3207890  1.0157198 -0.4102392 

除了用上述base函数plot绘图外,还可以用dsn函数来生成密度分布

ggplot(data = data.frame(x = data), aes(x = x)) +
  geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
  stat_function(fun = dsn, args = c(omega = 2.02, alpha = 5.05, xi = -5.04))

skewed_distribution

fGarch包为例,用snormFit函数拟合skewed distribution并绘制曲线

set.seed(123)
data <- rsnorm(1000, mean = 0, sd = 1, xi = 1.5)
# Fit curve
snormFit(data)

拟合估计的参数:

> snormFit(data)$par
       mean          sd          xi 
-0.01508664  1.00081242  1.61583737 

dsnorm函数来生成密度分布

ggplot(data = data.frame(x = data), aes(x = x)) +
  geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
  stat_function(fun = dsnorm, args = c(mean = -0.015, sd = 1.00, xi = 1.61))

skewed_distribution2

如果以同一个模拟数据来看下snfGarch两者的区别

set.seed(123)
data <- rsnorm(1000, mean = 0, sd = 1, xi = 1.5)
fit <- selm(data ~ 1, family = "SN")
fit@param$dp
snormFit(data)$par

ggplot(data = data.frame(x = data), aes(x = x)) +
  geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
  stat_function(mapping = aes(colour = "fGarch"), size = 1.2, 
                fun = dsnorm, args = c(mean = as.numeric(snormFit(data)$par["mean"]), 
                                       sd = as.numeric(snormFit(data)$par["sd"]), 
                                       xi = as.numeric(snormFit(data)$par["xi"]))) +
  stat_function(mapping = aes(colour = "sn"), size = 1.2, 
                fun = dsn, args = c(omega = as.numeric(fit@param$dp["omega"]), 
                                    alpha = as.numeric(fit@param$dp["alpha"]), 
                                    xi = as.numeric(fit@param$dp["xi"]))) +
  scale_colour_manual(name = NULL, values = c("blue", "red"))

skewed_distribution3

参考资料

An introduction to the R package sn
Skew Normal Distribution and Parameter Estimation
How can we create right/left skewed normal distribution curve in R ?

本文出自于http://www.bioinfo-scrounger.com转载请注明出处

文章目录
  1. 1. Simulating data
  2. 2. Parameter estimation
  3. 3. 参考资料
|