How to plot data grouped by a factor, but not as a boxplot
In R, given a vector
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001, 0.5097587, 0.6476510, 0.4552577, 0.5578296, 0.5728478, 0.1927945, 0.2624068, 0.2732615)
and a factor:
trans.factor <- factor (rep (c("t0", "t12", "t24", "t72"), c(4,3,3,3)))
I want to create a plot where the data points are grouped as defined by the factor. So the categories should be on the x-axis, values in the same category should have the same x coordinate.
Simply doing plot(trans.factor, casp6)
does almost what I want: it produces a boxplot, but I want to see the individual data points.
require(ggplot2)
qplot(trans.factor, casp6)
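For readers on a recent ggplot2 release, where qplot() is (as far as I know) deprecated, the same plot can be sketched with ggplot() and geom_point(). The data frame name df and the column name trans are just illustrative choices, not part of the question:

```r
library(ggplot2)

# Data from the question
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001,
           0.5097587, 0.6476510, 0.4552577, 0.5578296, 0.5728478,
           0.1927945, 0.2624068, 0.2732615)
trans.factor <- factor(rep(c("t0", "t12", "t24", "t72"), c(4, 3, 3, 3)))

# ggplot() wants a data frame; 'df' and 'trans' are hypothetical names
df <- data.frame(trans = trans.factor, casp6 = casp6)

# One x position per factor level, one point per observation
p <- ggplot(df, aes(x = trans, y = casp6)) + geom_point()
print(p)
```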
You can do it with ggplot2, using facets. When I read "I want to create a plot where the data points are grouped as defined by the factor", the first thing that came to my mind was facets.
But in this particular case, a faster alternative should be:
plot(as.numeric(trans.factor), casp6)
And you can play with plot options afterwards (type, fg, bg...), but I recommend sticking with ggplot2, since it has much cleaner code, great functionality, you can avoid overplotting... etc. etc.
Learn how to deal with factors. You got a boxplot when evaluating plot(trans.factor, casp6) 'cause trans.factor was of class factor (ironically, you even named it in such a manner)... and trans.factor, as such, was passed before a continuous (numeric) variable within the plot() function... hence plot() "feels" the need to subset the data and draw a box for each part (if you pass the continuous variable first, you'll get an ordinary graph, right?). ggplot2, on the other hand, interprets the factor in a different way: with the syntax provided by Jonathan Chang it simply draws points, with each factor level at its own x position (you must specify geom when doing something more complex in ggplot2).
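A small sketch of that behaviour, using the data from the question. The comments describe what I would expect from base graphics; nothing here goes beyond the calls already discussed above:

```r
# Data from the question
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001,
           0.5097587, 0.6476510, 0.4552577, 0.5578296, 0.5728478,
           0.1927945, 0.2624068, 0.2732615)
trans.factor <- factor(rep(c("t0", "t12", "t24", "t72"), c(4, 3, 3, 3)))

# plot() is generic: the method is chosen from the class of the first argument
class(trans.factor)

# Factor first: the factor method is used, which draws one box per level
plot(trans.factor, casp6)

# Numeric first: an ordinary scatter plot, with the factor on the y axis
plot(casp6, trans.factor)
```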
But let's suppose that you have one continuous variable and a factor, and you want to draw a histogram for each part of the continuous variable, as defined by the factor levels. This is where things become complicated with base graphics.
# create dummy data
set.seed(23)
x <- rnorm(200, 23, 2.3)
g <- factor(round(runif(200, 1, 4)))
By using base graphs (package:graphics):
par(mfrow = c(1, 4))
tapply(x, g, hist)
ggplot2 way:
qplot(x, facets = . ~ g)
Try to do this with graphics in one line of code (semicolons and custom functions are considered cheating!):
qplot(x, log(x), facets = . ~ g)
Let's hope that I haven't bored you to death, but helped you!
Kind regards,
aL3xa
I find the following solution:
stripchart(casp6~trans.factor,data.frame(casp6,trans.factor),pch=1,vertical=T)
simple and direct.
(See e.g. http://www.mail-archive.com/r-help@r-project.org/msg34176.html)
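If several values in a group coincide or nearly coincide, stripchart() can also spread them out a little. A minimal sketch, using its method and jitter arguments; the jitter amount of 0.1 is an arbitrary choice for illustration:

```r
# Data from the question
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001,
           0.5097587, 0.6476510, 0.4552577, 0.5578296, 0.5728478,
           0.1927945, 0.2624068, 0.2732615)
trans.factor <- factor(rep(c("t0", "t12", "t24", "t72"), c(4, 3, 3, 3)))

# method = "jitter" adds a small random horizontal offset within each group,
# so points with similar values do not sit exactly on top of each other
stripchart(casp6 ~ trans.factor, vertical = TRUE, pch = 1,
           method = "jitter", jitter = 0.1)
```

Because the offsets are random, the picture changes slightly from run to run unless set.seed() is called first.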
You may be able to get close to what you want using lattice graphics by doing:
library(lattice)
xyplot(casp6 ~ trans.factor,
scales = list(x = list(at = 1:4, labels = levels(trans.factor))))
I think there's a better solution (I wrote it for a workshop a few days ago), but it slipped my mind. Here's an ugly substitute with base graphics. Feel free to annotate the x axis ad libitum. Personally, I like Greg's solution.
plot(0, 0, xlim = c(1, 4), ylim = range(casp6), type = "n")
points(casp6 ~ trans.factor)
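One way to annotate the x axis, sketched with base graphics only: suppress the default axis with xaxt = "n" and draw one tick per factor level with axis(). The widened xlim and the axis titles are my own choices, not part of the answer above:

```r
# Data from the question
casp6 <- c(0.9478638, 0.7477657, 0.9742675, 0.9008372, 0.4873001,
           0.5097587, 0.6476510, 0.4552577, 0.5578296, 0.5728478,
           0.1927945, 0.2624068, 0.2732615)
trans.factor <- factor(rep(c("t0", "t12", "t24", "t72"), c(4, 3, 3, 3)))

# Positions 1..4, one per factor level
ticks <- seq_along(levels(trans.factor))

# Empty plot without the default numeric x axis
plot(0, 0, type = "n", xaxt = "n",
     xlim = c(0.5, length(ticks) + 0.5), ylim = range(casp6),
     xlab = "trans.factor", ylab = "casp6")

# Label each position with its factor level, then add the points
axis(1, at = ticks, labels = levels(trans.factor))
points(as.numeric(trans.factor), casp6)
```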
No extra package needed
I'm a bit late to the party, but I found that you can get the desired result very easily with the standard plot function -- simply convert the factor to a numeric value:
plot(as.numeric(trans.factor), casp6)
10 year old question...but if you want a neat base R solution:
plot(trans.factor, casp6, border=NA, outline=FALSE)
points(trans.factor, casp6)
The first line sets up the plot but draws nothing. The second adds the points. This is slightly neater than the solutions that force x to be numeric.