How to plot a nice graph using few commands, separating drawing logic from layout?
Is there a simple way to make a nice plot of the following data in R, without using many commands?
Region1 Region2
2007 17 55
2008 26 43
2009 53 70
2010 96 58
I do know how to plot the data, but it uses too many commands and parameters, and the result still looks absolutely terrible (see here):
> test <- read.table("/tmp/data.txt")
> png(filename="/tmp/test.png", height=1000, width=750, bg="white", res=300)
> plot(test$Region1, type="b", col="blue", ylim=c(0,100), lwd=3)
> lines(test$Region2, type="b", col="red", lwd=3)
> dev.off()
It took me a while to figure out all the commands, and I still have to get the x axis labels (2007, 2008, ...), using the axis command (but how do I access the test x axis labels?), etc.
In Keynote (or Powerpoint) I can just give it the same table (transposed) and it produces a nice graph from it (see here).
My question is really: Is there a higher-level command that draws such typical data nicely? Also, how can I separate the drawing logic (draw 2 lines from that specific data, etc.) from the layout (use specific colors and line types for the graph, etc.)? Ideally, I'd hope there were different libraries for different layouts of the graph, e.g. called NiceKeynoteLayout, which I just could use like this (or similar):
> d <- read.table("/tmp/data.txt")
> png <- png(filename="/tmp/test.png", height=1000, width=750)
> myLayout <- loadPredefinedLayout("NiceKeynoteLayout")
> coolplot(d, layout=myLayout, out=png)
Yes, and in my biased opinion, you're best off using the ggplot2 package for creating graphics. Here's how you might do so with your data (thanks to Dirk for providing a sample dataset):
data <- data.frame(Year=seq(as.Date("2007-01-01"),
as.Date("2010-01-01"), by="year"),
Region1=c(17,26,53,96), Region2=c(55,43,70,58))
library(ggplot2)
library(reshape2)  # provides melt(); current ggplot2 does not attach it
# Convert data to a form optimised for visualisation, not
# data entry
data2 <- melt(data, id.vars = "Year",
              measure.vars = c("Region1", "Region2"))
# Define the visualisation you want
ggplot(data2, aes(x = Year, y = value, colour = variable)) +
geom_line()
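One way to get close to the "predefined layout" idea from the question is to keep the styling in a separate object and add it to the plot. The following is only a sketch: `keynote_layout` is a hypothetical name (not part of ggplot2), the colours are arbitrary choices, and the long-format data frame is built by hand so the example runs without `melt()`.

```r
library(ggplot2)

# Same values as above, built directly in long form with base R
data2 <- data.frame(
  Year     = rep(2007:2010, times = 2),
  variable = rep(c("Region1", "Region2"), each = 4),
  value    = c(17, 26, 53, 96, 55, 43, 70, 58)
)

# Drawing logic: which columns map to which aesthetics, and which geoms to use
p <- ggplot(data2, aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  geom_point()

# Layout: a reusable list of theme and scale components.
# `keynote_layout` is a hypothetical name chosen for this example.
keynote_layout <- list(
  theme_minimal(),
  scale_colour_manual(values = c(Region1 = "#304E67", Region2 = "#974449")),
  theme(legend.position = "top")
)

# Combine the two; the same list can be added to any other plot
p + keynote_layout
```

Because the layout is just a list, it could live in its own file and be reused across plots, while `p` stays free of styling details.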
Here is R code that plots the data in a nice way (it is not simple code as requested, but at least the result looks good):
test <- read.table("/tmp/test.txt", header=TRUE)
png(filename="/tmp/test.png", height=750, width=1000,
bg="white", res=300)
par(mar=c(2.5,2.5,0.75,0.75),
family="Gill Sans", font=1, # font 2 would be bold
cex=0.75, cex.lab=0.75, cex.axis=0.75)
mymax <- max(test$Region1, test$Region2)*1.25
plot(test$Region1, type="b", col="#304E67",
ylim=c(0, mymax), lwd=3,
bty="l", axes=FALSE, ann=FALSE, cex=1.0, tck=1)
axis(1, lwd.ticks=0, at=1:length(test$Year), lab=test$Year)
axis(2, lwd=0, las=1, at=c(0,25,50,75,100), yaxp=c(0,100,4))
# grid(nx = NA, ny = 5, col = "lightgray") # wrong, see axTicks
for(y in c(25, 50, 75, 100)) {
lines(rep(y, length(test$Region1)), type="l", col="lightgray", lwd=1)
}
lines(test$Region1, type="b", col="#304E67", lwd=3)
lines(test$Region2, type="b", col="#974449", lwd=3)
# title(xlab="Year", col.lab=rgb(0,0.5,0))
# title(ylab="Output", col.lab=rgb(0,0.5,0))
legend(1, mymax+8, c("Region 1","Region 2"), cex=0.75,
col=c("#304E67" ,"#974449"),
pch=1:1, # circles
lty=1:1, # solid
lwd=1.5, # line width
bty="n") # no box around
dev.off()
The data file has this content:
Year Region1 Region2
2007 17 55
2008 26 43
2009 53 70
2010 96 58
It produces the following graph:
which comes pretty close to the graph that Keynote draws:
You may want to read up on help(par), which is a very useful source of information for customizing standard R graphs. This allows you to
- have tighter margins (e.g. par(mar=c(3,3,1,1)))
- change font sizes (e.g. par(cex=0.7), or one of the more specific cex alternatives)
- set colors or line types
- ...
all of which comes close to the loadPredefinedLayout() functionality you desire.
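As a rough illustration of that idea in base graphics, the par() settings can be stored in a named list and applied around a drawing function. This is a minimal sketch, not an established API: `keynote_layout` and `draw_regions` are hypothetical names, and the particular settings are arbitrary.

```r
d <- data.frame(Year    = 2007:2010,
                Region1 = c(17, 26, 53, 96),
                Region2 = c(55, 43, 70, 58))

# Layout: a named list of par() settings (hypothetical name)
keynote_layout <- list(mar = c(3, 3, 1, 1), cex = 0.7, las = 1, bty = "l")

# Drawing logic: knows about the data, not about margins or font sizes
draw_regions <- function(d) {
  plot(d$Year, d$Region1, type = "b", ylim = c(0, 100),
       xlab = "", ylab = "", xaxt = "n")
  lines(d$Year, d$Region2, type = "b", lty = 2)
  axis(1, at = d$Year)  # one tick per year instead of fractional ticks
}

old <- par(keynote_layout)  # apply the layout, remembering the previous values
draw_regions(d)
par(old)                    # restore the previous settings
```

par() accepts a single list of tagged values and returns the previous values, which is what makes the apply-and-restore pattern work.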
Lastly, for the axes you are better off either using a time-aware class like zoo, or explicitly giving the x-axis argument as in the example below:
R> data <- data.frame(Year=seq(as.Date("2007-01-01"),
                               as.Date("2010-01-01"), by="year"),
                      Region1=c(17,26,53,96), Region2=c(55,43,70,58))
R> data
Year Region1 Region2
1 2007-01-01 17 55
2 2008-01-01 26 43
3 2009-01-01 53 70
4 2010-01-01 96 58
R> par(mar=c(3,4,1,1))
R> plot(data$Year, data$Region1, type='l', col='blue', ylab="Values")
R> lines(data$Year, data$Region2, col='red')
R>
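On the question of a higher-level command: base R's matplot() draws every column of a matrix in one call, which replaces the plot()/lines() pair. A small sketch under the assumption that plain numeric years are acceptable on the x axis (the colours and labels are arbitrary choices):

```r
years   <- 2007:2010
regions <- cbind(Region1 = c(17, 26, 53, 96),
                 Region2 = c(55, 43, 70, 58))

# One call draws both series; xaxt = "n" suppresses the default x axis
matplot(years, regions, type = "b", pch = 1, lty = 1, lwd = 3,
        col = c("blue", "red"), ylim = c(0, 100),
        xlab = "Year", ylab = "Values", xaxt = "n")

# Put one tick at each year
axis(1, at = years)

# Legend labels come straight from the column names
legend("topleft", legend = colnames(regions), col = c("blue", "red"),
       lty = 1, pch = 1, bty = "n")
```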
Here is what is, in my opinion, a slightly improved version of the graphic suggested by Hadley. I think it is now pretty much like the original graphic you tried to replicate (even better, actually, with direct labels).
After converting the data as suggested by Hadley,
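# opts() has been removed from ggplot2; theme() is its replacement
plot <- ggplot(data2, aes(Year, value, group = variable,
                          colour = variable)) +
  geom_line(linewidth = 1) +  # use size = 1 in ggplot2 before 3.4.0
  theme(legend.position = "none")
plot <- plot + geom_point()
# Label each line at its last point; Year is a Date column, so compare the year part
plot + geom_text(data = data2[format(data2$Year, "%Y") == "2010", ],
                 aes(label = variable), hjust = 1.2, vjust = 1)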