R + ggplot2 - Aggregate Data by Intervals
I have a file in which each line holds a numeric value representing an average duration:
12.3
5.4
6
...
Is there a way in R to group the data into automatic or manual intervals/breaks (aggregate?), something like this:
[0,1[ 0
[1, 6[ 1
[6, 20[ 2
...
Next, I want to produce a plot in ggplot2 showing this data. Could I use these intervals as labels?
You can bin data with the cut() function in base R, or use the Hmisc package and cut2(). There are several options for how to go about cutting and slicing your data, all of which are documented in help(cut) or help(cut2), respectively.
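For the manual intervals in the question, a minimal sketch could look like the following. It assumes the three sample values shown in the question and the breaks 0, 1, 6 and 20; the file name in the comment is hypothetical. Setting right = FALSE makes the intervals closed on the left and open on the right, which matches the [a, b[ notation.

```r
# The three sample values from the question. With a real file you could
# read them with something like scan("durations.txt") (hypothetical name).
durations <- c(12.3, 5.4, 6)

# Manual breaks; right = FALSE gives left-closed intervals [a, b)
bins <- cut(durations, breaks = c(0, 1, 6, 20), right = FALSE)

# Count the values in each interval, including empty ones
counts <- table(bins)
print(counts)
```

table() keeps the empty interval because cut() returns a factor whose levels include every bin, whether or not any value falls into it.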
Once you have binned your data appropriately, plotting with ggplot becomes a trivial exercise:
library(ggplot2)
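# Sample data: 1000 integers drawn from 1 to 100 with replacement
set.seed(1)
dat <- data.frame(x = sample(1:100, 1000, TRUE))
# Split x into 5 equal-width intervals; the result is a factor
dat$cuts <- cut(dat$x, breaks = 5)
# Bar chart of counts per interval; the factor levels become the axis labels
# (qplot() is deprecated in recent ggplot2 releases, so use ggplot() directly)
ggplot(dat, aes(x = cuts)) + geom_bar()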
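On the question of using the intervals as labels: the levels of the factor returned by cut() are what ggplot2 prints on a discrete axis, so they can be set through the labels argument. The sketch below is only an illustration, assuming the three sample values from the question and manual breaks at 0, 1, 6 and 20; the column names duration and bin are made up for the example.

```r
library(ggplot2)

# The three sample values from the question (illustrative only)
dat <- data.frame(duration = c(12.3, 5.4, 6))

# Bin with manual breaks and supply the interval labels to show on the axis
dat$bin <- cut(dat$duration,
               breaks = c(0, 1, 6, 20),
               right  = FALSE,
               labels = c("[0,1[", "[1,6[", "[6,20["))

# drop = FALSE keeps empty intervals on the x axis
p <- ggplot(dat, aes(x = bin)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE) +
  labs(x = "Duration interval", y = "Count")
print(p)
```

Without drop = FALSE the empty [0,1[ interval would be omitted from the axis, which may or may not be what you want.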