How do I scale the y-axis on a histogram by the x values in R?
I have some data representing the sizes of particles. I want to plot the frequency of each size bin as a histogram, but scale the frequency by the size of the particle (so that it represents the total mass at that size).
I can plot a histogram fine, but I am unsure how to scale the Y-axis by the X-value of each bin.
e.g. if I have 10 particles in the 40-60 bin, I want the Y-axis value to be 10*50=500.
You would do better to use barplot, so that the area of each bar represents the total mass (i.e. the height gives the count and the width gives the mass per particle):
sizes <- 3:10 # your sizes
part.type <- sample(sizes, 1000, replace = TRUE) # your particle sizes
count <- table(part.type)
barplot(count, width = sizes) # bar width = size, so bar area = count * size
If your particle sizes are all different, first cut the range into an appropriate number of intervals to create the part.type factor:
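part <- rchisq(1000, 10)
breaks <- pretty(part, n = 4) # roughly 4 equal-width intervals covering the data
part.type <- cut(part, breaks, include.lowest = TRUE)
count <- table(part.type)
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2 # assumption: the bin midpoint stands in for the size
barplot(count, width = mids)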
If the only quantity of interest is the total mass, then the appropriate plot is a dot chart. It is also much clearer than a bar plot when there are many sizes:
part <- rchisq(1000, 10)
part.type <- cut(part, 20)
count <- table(part.type)
dotchart(count)
Representing the total mass with bins would be misleading because the area of the bins is meaningless.
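The dotchart call above plots the raw counts. As a minimal sketch of plotting the total mass per bin instead, assuming the bin midpoint is an acceptable stand-in for the size of every particle in that bin (the names h and mass are only illustrative, and hist() is used here just to do the binning):

```r
set.seed(1)                                 # reproducible dummy data
part <- rchisq(1000, 10)                    # particle sizes, as in the example above
h <- hist(part, breaks = 20, plot = FALSE)  # bin only; breaks = 20 is a suggestion to hist()
mass <- h$counts * h$mids                   # count times assumed size, e.g. 10 * 50 = 500
names(mass) <- paste(head(h$breaks, -1), tail(h$breaks, -1), sep = "-")  # label each bin by its range
dotchart(mass, xlab = "count x bin midpoint")
```

Each dot then sits at count times midpoint for its bin, so position alone carries the quantity and no bar area has to be interpreted.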
If you really want to use the midpoint of each bin as a scaling factor:
d <- rgamma(100, 5, 1.5) # sample
z <- hist(d, plot = FALSE) # make histogram, i.e., divide into bins and count up
co <- z$counts # original counts of each bin
z$counts <- z$counts * z$mids # scale by the midpoint of each bin
plot(z, xlim = c(0, 10), ylim = c(0, max(z$counts))) # plot scaled histogram
points(z$mids, co, col = 2) # overplot original counts on the same axes
If instead you want to use the actual value of each sample point as a scaling factor:
d <- rgamma(100, 5, 1.5)
z <- hist(d, plot = FALSE)
co <- z$counts # original counts of each bin
bin <- cut(d, z$breaks, include.lowest = TRUE) # assign each point to its histogram bin
z$counts <- as.vector(tapply(d, bin, sum, default = 0)) # sum the data in each bin; empty bins give 0
plot(z, xlim = c(0, 10), ylim = c(0, max(z$counts))) # plot scaled histogram
points(z$mids, co, col = 2) # overplot original counts on the same axes
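To see how far the two scalings above can differ, here is a rough side-by-side check. It is only a sketch: the names bin, by_mid and by_sum are illustrative, and the default argument of tapply assumes R 3.4.0 or later.

```r
set.seed(42)                                  # reproducible dummy data
d <- rgamma(100, 5, 1.5)
z <- hist(d, plot = FALSE)                    # bins and counts only
bin <- cut(d, z$breaks, include.lowest = TRUE)
by_mid <- z$counts * z$mids                   # count times bin midpoint
by_sum <- as.vector(tapply(d, bin, sum, default = 0))  # sum of actual values per bin
print(data.frame(mid = z$mids, count = z$counts, by_mid, by_sum))
# The per-bin sums must add up to the overall total
stopifnot(isTRUE(all.equal(sum(by_sum), sum(d))))
```

Since every point lies within half a bin width of its bin's midpoint, the two columns can differ by at most count times half the bin width, so narrower bins bring the midpoint version closer to the exact sums.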
Just hide the axes and replot them as needed.
# Generate some dummy data
datapoints <- runif(10000, 0, 100)
par (mfrow = c(2,2))
# We will plot 4 histograms, with different bin size
binsize <- c(1, 5, 10, 20)
for (bs in binsize)
{
  # Plot the histogram. Hide the axes by setting axes=FALSE
  h <- hist(datapoints, seq(0, 100, bs), col="black", axes=FALSE,
            xlab="", ylab="", main=paste("Bin size: ", bs))
  # Plot the x axis without modifying it
  axis(1)
  # This will NOT plot the axis (lty=0, labels=FALSE), but it will return the tick values
  yax <- axis(2, lty=0, labels=FALSE)
  # Plot the axis by appropriately scaling the tick values
  axis(2, at=yax, labels=yax/bs)
}