Adding summary information to a density plot created with ggplot
I have a density plot and I would like to add some summary information, such as a line at the median and shading for the 90% credible interval (between the 5th and 95th percentiles). Is there a way to do this in ggplot?
This is the type of plot that I would like to summarize:
I can figure out how to draw a line from y = 0 to y = density(median(x)), but it is not clear to me whether I can shade the plot with a 90% CI. Alternatively, I could add a horizontal boxplot above the density plot, but it is not clear how to rotate the boxplot by itself, without rotating the density plot along with it.
x <- rnorm(10000)
d <- data.frame(x = x)
library(ggplot2)
ggplot(data = d) + theme_bw() +
  # after_stat(density) maps y to the computed density (needs ggplot2 >= 3.3.0)
  geom_density(aes(x = x, y = after_stat(density)), color = 'black')
You can use the geom_area() function. First make the density explicit using the density() function.
x <- rnorm(10000)
d <- data.frame(x = x)
library(ggplot2)
p <- ggplot(data = d) + theme_bw() +
  # after_stat(density) maps y to the computed density (needs ggplot2 >= 3.3.0)
  geom_density(aes(x = x, y = after_stat(density)), color = 'black')
# new code is below
q5 <- quantile(x,.05)
q95 <- quantile(x,.95)
medx <- median(x)
x.dens <- density(x)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)
p + geom_area(data = subset(df.dens, x >= q5 & x <= q95),
aes(x=x,y=y), fill = 'blue') +
geom_vline(xintercept = medx)
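If you would rather stop the median line at the curve (the y = 0 to y = density(median(x)) idea from the question) than run it the full height of the panel, one option is to interpolate the density estimate at the median with approx(). This is only a minimal sketch under the same simulated data; the seed and the names med_height and p_seg are made up for illustration.

```r
library(ggplot2)

set.seed(1)                       # assumption: fixed seed so the sketch is reproducible
x <- rnorm(10000)
d <- data.frame(x = x)

# Kernel density estimate with the default Gaussian kernel and bandwidth
x.dens  <- density(x)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)
q5   <- quantile(x, .05)
q95  <- quantile(x, .95)
medx <- median(x)

# Linearly interpolate the estimated density at the median
med_height <- approx(x.dens$x, x.dens$y, xout = medx)$y

p_seg <- ggplot(d, aes(x = x)) + theme_bw() +
  # shade the central 90% of the sample
  geom_area(data = subset(df.dens, x >= q5 & x <= q95),
            aes(x = x, y = y), fill = 'blue', alpha = 0.4) +
  geom_density(color = 'black') +
  # median line from the axis up to the curve only
  annotate("segment", x = medx, xend = medx, y = 0, yend = med_height)
p_seg
```

annotate() is used instead of geom_line() here because the segment is a single fixed annotation rather than something mapped from the rows of d.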
I wanted to add to @Prasad Chalasani's answer for those who, like me, came looking to shade all three standard-deviation bands. The 1 Std band is the darkest shade, the 2 Std band is the middle shade, and the 3 Std band is the lightest shade. The mean is the black line and the median is the white line.
set.seed(501) # Make random sample reproducible
x <- rnorm(100)
d <- data.frame(x = x)
library(ggplot2)
p <- ggplot(data = d) +
  theme_bw() +
  geom_density(aes(x = x, y = after_stat(density)), color = '#619CFF')
# new code is below
q15.9 <- quantile(x, .159) # 1 Std 68.2%
q84.1 <- quantile(x, .841)
q2.3 <- quantile(x, .023) # 2 Std 95.4%
q97.7 <- quantile(x, .977)
q0.1 <- quantile(x, .001) # 3 Std 99.8%
q99.9 <- quantile(x, .999)
meanx <- mean(x)
medx <- median(x)
x.dens <- density(x)
df.dens <- data.frame(x = x.dens$x, y = x.dens$y)
# the bands overlap, so the stacked alpha makes the innermost band darkest
p + geom_area(data = subset(df.dens, x >= q15.9 & x <= q84.1), # 1 Std 68.2%
              aes(x = x, y = y), fill = '#619CFF', alpha = 0.8) +
  geom_area(data = subset(df.dens, x >= q2.3 & x <= q97.7), # 2 Std 95.4%
            aes(x = x, y = y), fill = '#619CFF', alpha = 0.6) +
  geom_area(data = subset(df.dens, x >= q0.1 & x <= q99.9), # 3 Std 99.8%
            aes(x = x, y = y), fill = '#619CFF', alpha = 0.3) +
  geom_vline(xintercept = meanx) +
  geom_vline(xintercept = medx, color = '#FFFFFF')
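For anyone wondering where the cutoffs come from: .159, .023 and .001 are rounded tail areas of a normal distribution at 1, 2 and 3 standard deviations. A short sketch with base R's pnorm() shows the unrounded values; the variable names are only illustrative.

```r
k <- 1:3                      # number of standard deviations
lower <- pnorm(-k)            # tail probability below mean - k*sd
upper <- pnorm(k)             # cumulative probability up to mean + k*sd
coverage <- upper - lower     # central area between the two cutoffs

# one row per k, rounded for display
round(data.frame(k, lower, upper, coverage), 4)
```

The exact lower tail at 3 standard deviations is about 0.0013, so the rounded .001/.999 pair covers slightly more (99.8%) than the textbook 99.7%. Note also that sample quantiles only line up with mean ± k·sd when the data are roughly normal.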
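This also draws a vertical line at the median, here as a segment from y = 0 to y = 0.4:
ggplot(data = d) + theme_bw() +
  geom_density(aes(x = x, y = after_stat(density)), color = 'black') +
  # a fixed segment goes through annotate(); aes() values must be length 1 or match the data
  annotate("segment", x = median(d$x), xend = median(d$x), y = 0, yend = 0.4)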