multiple lines each based on a different dataframe in ggplot2 - automatic coloring and legend

Suppose I have the following data frames:

df1 = data.frame(c11 = c(1:5), c12 = c(1:5))
df2 = data.frame(c21 = c(1:5), c22 = (c(1:5))^0.5)
df3 = data.frame(c31 = c(1:5), c32 = (c(1:5))^2)

I want to plot these as lines in the same plot/panel. I can do this by

p <- ggplot() +
     geom_line(data = df1, aes(x = c11, y = c12)) +
     geom_line(data = df2, aes(x = c21, y = c22)) +
     geom_line(data = df3, aes(x = c31, y = c32))

All of these lines will be black. If I want them in different colors, I can pass the color explicitly as an argument to each geom_line(). My question is: can I instead specify a list of a few colors, say five (red, blue, green, orange, gray), and have ggplot2 use that list, so that I do not have to set the color in each geom_line() call? If the plot p contains two geom_line() calls, they would be colored red and blue, respectively; if it contains three, they would be colored red, blue, and green. Finally, how can I specify the legend for these plots? Even being able to give the colors as a vector at the end of p would be great. Please let me know if the question is not clear.

Thanks.


ggplot2 works best with a melted (long-format) data.frame that has a separate column for each aesthetic you want to map. Melting is easier with common column names, so I'd start there. Here are the steps I'd take:

  • rename the columns
  • melt the data, which adds a new variable that we'll map to the colour aesthetic
  • define your colour vector
  • specify the appropriate scale with scale_colour_manual

library(reshape2)  # provides melt()
library(ggplot2)

# Give all three data frames the same column names
names(df1) <- c("x", "y")
names(df2) <- c("x", "y")
names(df3) <- c("x", "y")

# Melting a named list adds a column L1 holding the list element names
newData <- melt(list(df1 = df1, df2 = df2, df3 = df3), id.vars = "x")

# Specify your colour vector
cols <- c("red", "blue", "green", "orange", "gray")

# Plot data and specify the manual scale
ggplot(newData, aes(x, value, colour = L1)) + 
  geom_line() +
  scale_colour_manual(values = cols)

Edited for clarity

The structure of newData:

'data.frame':   15 obs. of  4 variables:
 $ x       : int  1 2 3 4 5 1 2 3 4 5 ...
 $ variable: Factor w/ 1 level "y": 1 1 1 1 1 1 1 1 1 1 ...
 $ value   : num  1 2 3 4 5 ...
 $ L1      : chr  "df1" "df1" "df1" "df1" ...

And the plot itself:

[Image: plot of the three lines, coloured by L1]
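For readers who would rather not depend on reshape2, the same steps can be sketched in base R. This is only an illustration under a few assumptions: every data frame has its x values in the first column and its y values in the second, and the helper name stack_frames and the column name source are made up for this example (they do not come from the answer above).

```r
library(ggplot2)

df1 <- data.frame(c11 = 1:5, c12 = 1:5)
df2 <- data.frame(c21 = 1:5, c22 = (1:5)^0.5)
df3 <- data.frame(c31 = 1:5, c32 = (1:5)^2)

# Hypothetical helper: stack a named list of two-column data frames into
# one long data frame, recording each list name in a `source` column.
stack_frames <- function(dfs) {
  pieces <- Map(function(d, nm) {
    data.frame(x = d[[1]], y = d[[2]], source = nm)
  }, dfs, names(dfs))
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

stacked <- stack_frames(list(df1 = df1, df2 = df2, df3 = df3))

# A palette longer than the number of lines; only the first three are used
cols <- c("red", "blue", "green", "orange", "gray")

p <- ggplot(stacked, aes(x, y, colour = source)) +
  geom_line() +
  scale_colour_manual(values = cols)
p
```

Adding a fourth data frame to the list passed to stack_frames() would pick up the next colour in cols, and the legend is generated from the source column.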


You don't have to melt, group, or gather. It's pretty simple: just add the color to each geom_line().

library(tidyverse)

df1 = data.frame(c11 = c(1:5), c12 = c(1:5))
df2 = data.frame(c21 = c(1:5), c22 = (c(1:5))^0.5)
df3 = data.frame(c31 = c(1:5), c32 = (c(1:5))^2)

p <- ggplot() +
  geom_line(data = df1, aes(x = c11, y = c12), color = "red") +
  geom_line(data = df2, aes(x = c21, y = c22), color = "blue") +
  geom_line(data = df3, aes(x = c31, y = c32), color = "green")
p
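Colours set as constants outside aes(), as above, do not produce a legend. One possible way to keep the separate layers and still get a legend is to map colour to a label string inside aes() and then assign the actual colours with scale_colour_manual. The sketch below assumes the labels "df1", "df2", "df3" and the legend title "Data frame"; both are arbitrary choices for illustration.

```r
library(ggplot2)

df1 <- data.frame(c11 = 1:5, c12 = 1:5)
df2 <- data.frame(c21 = 1:5, c22 = (1:5)^0.5)
df3 <- data.frame(c31 = 1:5, c32 = (1:5)^2)

# Named vector: legend label -> colour (labels are illustrative)
line_cols <- c("df1" = "red", "df2" = "blue", "df3" = "green")

p <- ggplot() +
  # Inside aes(), the string is treated as data (a label), not as a colour
  geom_line(data = df1, aes(x = c11, y = c12, colour = "df1")) +
  geom_line(data = df2, aes(x = c21, y = c22, colour = "df2")) +
  geom_line(data = df3, aes(x = c31, y = c32, colour = "df3")) +
  # The manual scale translates each label into its colour and draws the legend
  scale_colour_manual(name = "Data frame", values = line_cols) +
  labs(x = "x", y = "y")
p
```

The colour vector now lives in one place at the end of the plot, although each layer still has to carry its own label.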



These sorts of questions become much easier to solve if you adjust your thinking to the way that ggplot2 approaches graphics. ggplot2 is organized around the idea that everything that appears in your graph should (in principle) exist as a column in your data frame. (There are exceptions, of course, but this is the general idea.)

So your attempt to build this graph piece by piece, one line at a time, each coming from different data frames and then assigning colors to them is very un-ggplot2ish. If you want to label things in your graph with different colors, your first thought should always be:

How can I encode this color labeling information as a variable?

In this case, the solution is fairly simple. rbind your three data frames together (you'll need to make sure the column names match first) and create a new column, say grp, that has three levels corresponding to your three data frames:

dat <- rbind(df1,df2,df3)
dat$grp <- rep(factor(1:3),times = c(nrow(df1),nrow(df2),nrow(df3)))

and then map the variable grp to the aesthetic color in the ggplot call:

ggplot(data = dat, aes(x = ..., y = ..., colour = grp)) + 
    geom_line()

Finally, if you don't like the default colors, you can specify your own using scale_colour_manual:

+ scale_colour_manual(values = c('green','blue','grey'))

or you can use some nice 'pre-chosen' palettes from scale_colour_brewer.

EDIT: I fixed a typo above to ensure that grp is a factor. Here's my final version:

df1 = data.frame(c1 = c(1:5), c2 = c(1:5))
df2 = data.frame(c1 = c(1:5), c2 = (c(1:5))^0.5)
df3 = data.frame(c1 = c(1:5), c2 = (c(1:5))^2)

dat <- rbind(df1,df2,df3)
dat$grp <- rep(factor(1:3),times=c(nrow(df1),nrow(df2),nrow(df3)))

ggplot(data = dat, aes(x = c1, y = c2, colour = grp)) + 
    geom_line()
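
As a rough sketch of the palette and legend options mentioned earlier, the final version can be extended with scale_colour_brewer. The palette "Set1", the legend title "Curve", and the three labels are assumptions chosen for illustration, not part of the original answer.

```r
library(ggplot2)

df1 <- data.frame(c1 = 1:5, c2 = 1:5)
df2 <- data.frame(c1 = 1:5, c2 = (1:5)^0.5)
df3 <- data.frame(c1 = 1:5, c2 = (1:5)^2)

dat <- rbind(df1, df2, df3)
dat$grp <- rep(factor(1:3), times = c(nrow(df1), nrow(df2), nrow(df3)))

p <- ggplot(data = dat, aes(x = c1, y = c2, colour = grp)) +
  geom_line() +
  # name sets the legend title; labels replace the factor levels 1, 2, 3
  scale_colour_brewer(
    palette = "Set1",
    name = "Curve",
    labels = c("y = x", "y = sqrt(x)", "y = x^2")
  )
p
```

Swapping scale_colour_brewer for scale_colour_manual(values = ...) with the same name and labels arguments would give hand-picked colours instead.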