偏态分布(Skewed distribution)是指频数分布不对称,集中位置偏向一侧。若集中位置偏向数值小的一侧,称为正偏态分布;集中位置偏向数值大的一侧,称为负偏态分布。 偏态分布只有满足一定的条件(如样本例数够大等)才可以看做近似正态分布
对于偏态分布的资料可看:The Skew-Normal Probability Distribution
结合网上资料,偏态分布的参数估计和拟合在R中常用的有sn
包和fGarch
包
Simulating data
用fGarch
包的rsorm
函数来生成一个Skew Normal Distribution,xi
为偏态参数;或者用sn
包的dsn
函数
library(sn)
library(fGarch)
# sn
data <- rsn(1000, xi=-5, omega=2, alpha=5)
# fGarch
data <- rsnorm(1000, mean = 0, sd = 1, xi = 1.5)
Parameter estimation
以sn
包为例,用selm
函数拟合skewed distribution并绘制曲线
set.seed(123)
data <- rsn(1000, xi=-5, omega=2, alpha=5)
# Fit curve
fit <- selm(data ~ 1, family = "SN")
plot(fit)
拟合估计的参数:
> fit@param$dp
xi omega alpha
0.3207890 1.0157198 -0.4102392
除了用上述base函数plot
绘图外,还可以用dsn
函数来生成密度分布
ggplot(data = data.frame(x = data), aes(x = x)) +
geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
stat_function(fun = dsn, args = c(omega = 2.02, alpha = 5.05, xi = -5.04))
以fGarch
包为例,用snormFit
函数拟合skewed distribution并绘制曲线
set.seed(123)
data <- rsnorm(1000, mean = 0, sd = 1, xi = 1.5)
# Fit curve
snormFit(data)
拟合估计的参数:
> snormFit(data)$par
mean sd xi
-0.01508664 1.00081242 1.61583737
用dsnorm
函数来生成密度分布
ggplot(data = data.frame(x = data), aes(x = x)) +
geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
stat_function(fun = dsnorm, args = c(mean = -0.015, sd = 1.00, xi = 1.61))
如果以同一个模拟数据来看下sn
和fGarch
两者的区别
set.seed(123)
data <- rsnorm(1000, mean = 0, sd = 1, xi = 1.5)
fit <- selm(data ~ 1, family = "SN")
fit@param$dp
snormFit(data)$par
ggplot(data = data.frame(x = data), aes(x = x)) +
geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
stat_function(mapping = aes(colour = "fGarch"), size = 1.2,
fun = dsnorm, args = c(mean = as.numeric(snormFit(data)$par["mean"]),
sd = as.numeric(snormFit(data)$par["sd"]),
xi = as.numeric(snormFit(data)$par["xi"]))) +
stat_function(mapping = aes(colour = "sn"), size = 1.2,
fun = dsn, args = c(omega = as.numeric(fit@param$dp["omega"]),
alpha = as.numeric(fit@param$dp["alpha"]),
xi = as.numeric(fit@param$dp["xi"]))) +
scale_colour_manual(name = NULL, values = c("blue", "red"))
参考资料
An introduction to the R package sn
Skew Normal Distribution and Parameter Estimation
How can we create right/left skewed normal distribution curve in R ?
本文出自于http://www.bioinfo-scrounger.com转载请注明出处