How to Split Dataset and plot in R
I am using a data set like:
1 48434 14566
1 56711 6289
1 58826 4174
2 56626 6374
2 58888 4112
2 59549 3451
2 60020 2980
2 60468 2532
3 56586 6414
3 58691 4309
3 59360 3640
3 59941 3059
.
.
.
10 56757 6243
10 58895 4105
10 59565 3435
10 60120 2880
10 60634 2366
I need a plot in R of the 3rd column for each value of the first column, i.e. for the above data there would be 10 different plots (one for each group 1-10) of the values of the 3rd column. The x-axis is the number of iterations and the y-axis is the values, with a maximum of 63000. I also need to connect the dots with a red line. I am new to R and have been reading the documentation, but that confused me more. Could anybody please help?
EDIT: I actually want a line graph of the V3 values. The row number within the V3 column would be on the x-axis and the V3 values on the y-axis. I also want a separate graph for each group indicated by V1. Chase's solution works, except that I want the axes swapped: the V3 values should be on the y-axis. Here is an example.
EDIT2: @Roman, Here is the code I am executing.
library(lattice)
d <- read.delim("c:\\proj58\\positions23.txt",sep="")
d <- do.call(rbind, lapply(split(d, d$V1), function(x) {
x$iterations <- order(x$V3, decreasing=TRUE)
x
}))
xyplot(V3 ~ iterations | V1, type="l", data=d)
This is the error I get,
>
> source("C:\\proj58\\plots2.R")
> d
V1 V2 V3 iterations
1.1 1 48434 14566 1
1.2 1 56711 6289 2
1.3 1 58826 4174 3
1.4 1 59528 3472 4
I am not getting any plot. What am I missing? OK, got it; I don't know what was wrong. Here it is.
Two more things: how do I change the V1 labels on the boxes to the actual numbers, like 1, 2, ...? Secondly, I have files that contain 100 groups. I tried one and it put all the graphs on a single page (unreadable, obviously). Can I spread these over more than one window?
Well, first you need to create a variable with the row number, for each subset of the first variable separately. Here's one way to do it, by splitting the data set by the first variable, making a new variable that has the row number, and recombining.
You also probably want V1 to be a factor (a categorical variable).
d <- do.call(rbind, lapply(split(d, d$V1), function(x) {
x$iterations <- 1:nrow(x)
x
}))
d$V1 <- factor(d$V1)
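As a possible alternative to the split/rbind approach above, the per-group row number can also be computed in one step with base R's ave(). This is only a minimal sketch: the toy data frame below copies a few rows from the question, and the column names V1, V2, V3 are assumed to match what read.delim produced there.

```r
# Toy data with the same V1/V2/V3 layout as the question (values copied from it).
d <- data.frame(
  V1 = c(1, 1, 1, 2, 2, 2, 2, 2),
  V2 = c(48434, 56711, 58826, 56626, 58888, 59549, 60020, 60468),
  V3 = c(14566, 6289, 4174, 6374, 4112, 3451, 2980, 2532)
)

# Number the rows within each V1 group, keeping the original row order.
d$iterations <- ave(seq_along(d$V1), d$V1, FUN = seq_along)

# Treat the grouping column as categorical, as suggested above.
d$V1 <- factor(d$V1)
```

Unlike the rbind version, this leaves the row names of d untouched (no "1.1", "1.2", ... names).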
Then, using the lattice library, you'd do something like
xyplot(V3 ~ iterations | V1, type="l", data=d)
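The question also asked for red lines connecting the dots and a y-axis that goes up to 63000. A sketch of how that might look with standard xyplot arguments is below; the lower limit of 0, the axis labels, and the small toy data frame are assumptions made for illustration.

```r
library(lattice)

# Toy data: two groups with a per-group iteration counter (values from the question).
d <- data.frame(
  V1 = factor(c(1, 1, 1, 2, 2, 2, 2, 2)),
  iterations = c(1:3, 1:5),
  V3 = c(14566, 6289, 4174, 6374, 4112, 3451, 2980, 2532)
)

# type = "b" draws both the points and the connecting lines;
# col sets their colour and ylim fixes the y-axis range.
p <- xyplot(V3 ~ iterations | V1, data = d,
            type = "b", col = "red", ylim = c(0, 63000),
            xlab = "Iteration", ylab = "V3")
print(p)
```

Because V1 is a factor here, the strip above each panel shows the group's level (1, 2, ...) rather than the variable name.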
To make the plots appear on more than one page, limit the number of plots on a page using the layout option. You'll need to save the plot to a file that supports multi-page output to do that. For example, for 5 rows and 5 columns:
trellis.device("pdf", file="myplot.pdf")
p <- xyplot(V3 ~ iterations | V1, type="l", data=d, layout=c(5,5))
plot(p)
dev.off()
Also, to make the plot appear when running the code using source, you need to specifically plot the output from the xyplot command, like
p <- xyplot(...)
plot(p)
When running at the console, this is not necessary, as the plot function (well, actually, the print function) is called on it by default.
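To illustrate the point about source: xyplot does not draw anything itself, it returns an object that gets drawn when it is printed. A small sketch, using made-up toy values:

```r
library(lattice)

# Hypothetical toy data, just to have something to plot.
d <- data.frame(iterations = 1:5,
                V3 = c(6374, 4112, 3451, 2980, 2532))

p <- xyplot(V3 ~ iterations, data = d, type = "l")  # nothing is drawn yet
class(p)   # "trellis"
print(p)   # this call is what actually draws the plot
```

At the console the value of a top-level expression is printed automatically, which is why the plot shows up there without an explicit call. source does not print values unless asked to, so the explicit print(p) or plot(p) is needed.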
Like Chase said, please clarify your question so that we can better envision what you're trying to achieve. To add to the heap of confusion, here's a ballpark lattice solution for what I think you may be after.
library(lattice)
fdt <- data.frame(col1 = factor(rep(1:10, each = 10)),  # 10 groups of 10 rows each
                  col2 = round(56 * rnorm(100, mean = 30, sd = 5)),
                  col3 = round(20 * rnorm(100, mean = 11)))
xyplot(col3 ~ 1:100 | col1, data = fdt)
I'm not exactly following what it is that you want to plot, but here's an approach that should get you down the right path. You can fill in the appropriate plotting command, or clarify your question and explain in more detail what the final result of your plot should look like.
We are going to take advantage of two packages: plyr and ggplot2. We will use plyr to split up your data into the appropriate groups and then use ggplot2 for the actual plotting. We'll take advantage of the pdf() function and put a different plot on each page.
library(plyr)    # Provides d_ply().
library(ggplot2)
library(psych)   # For copying in data, not needed beyond that.
df <- read.clipboard(header = FALSE)
pdf("test.pdf")
d_ply(df, "V1", function(x) # Split on the first column
  print(qplot(x$V3)) # Your plotting command should go here. This plots histograms.
)
dev.off() # Close the plotting device.
This will generate an n-page PDF, where n is the number of groups in V1 (your splitting column). If you'd rather have JPEG output, look at ?jpeg or the other graphics devices for producing other formats.
EDIT: As you can see, people interpreted your question in a few ways. If @Roman's solution is more what you want, here's roughly the same ggplot code:
qplot(col2, col3, data = fdt, geom = "point") + facet_wrap(~ col1 , nrow = 2)