Get a histogram plot of factor frequencies (summary)

I've got a factor with many different values. If you execute summary(factor) the output is a list of the different values and their frequency. Like so:

A B C D
3 3 1 5

I'd like to make a histogram of the frequency values, i.e. the X-axis contains the different frequencies that occur and the Y-axis the number of factor levels that have each frequency. What's the best way to accomplish that?

Edit: thanks to the answer below, I figured out that I can take the frequencies from the table, turn them into a factor, and plot that, which looks like this (if f is the factor):

plot(factor(table(f)))
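
For reference, a minimal sketch of that idea on the example counts from the question. The factor `f` is rebuilt here from the summary shown above (an assumption, since the original data are not given), and the names `level_counts` and `freq_of_freq` are introduced only for illustration. `table(table(f))` tabulates how many levels share each frequency, which is the same information that `plot(factor(table(f)))` draws.

```r
# Rebuild a factor matching the summary in the question: A = 3, B = 3, C = 1, D = 5
f <- factor(rep(c("A", "B", "C", "D"), times = c(3, 3, 1, 5)))

# Frequency of each level
level_counts <- table(f)

# Number of levels that share each frequency
freq_of_freq <- table(level_counts)
print(freq_of_freq)

# Bar plot of those counts
barplot(freq_of_freq, xlab = "Frequency", ylab = "Number of levels")
```

With this data the frequencies 1, 3 and 5 occur for one, two and one level respectively.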


Update in light of clarified Q

set.seed(1)
dat2 <- data.frame(fac = factor(sample(LETTERS, 100, replace = TRUE)))
hist(table(dat2), xlab = "Frequency of Level Occurrence", main = "")

gives:

[Figure: histogram of the level frequencies in dat2]

Here we just apply hist() directly to the result of table(dat2). table(dat2) provides the frequency of each level of the factor, and hist() produces the histogram of those frequencies.
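
Because the frequencies are integers, the default bins chosen by hist() can lump neighbouring frequencies together. The following is a small sketch, assuming one bar per distinct integer frequency is wanted: the breaks are placed at half-integers. The names `counts`, `brks` and `h` are introduced here for illustration only.

```r
set.seed(1)
dat2 <- data.frame(fac = factor(sample(LETTERS, 100, replace = TRUE)))

# Per-level frequencies as a plain integer vector
counts <- as.vector(table(dat2$fac))

# Breaks at half-integers so each bar covers exactly one integer frequency
brks <- seq(min(counts) - 0.5, max(counts) + 0.5, by = 1)
h <- hist(counts, breaks = brks,
          xlab = "Frequency of Level Occurrence", main = "")

# Every observed level falls into exactly one bar
sum(h$counts) == length(counts)
```

The object returned by hist() holds the bar heights in `h$counts`, so the numbers behind the plot can be inspected directly.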


Original

There are several possibilities. Your data:

dat <- data.frame(fac = factor(rep(LETTERS[1:4], times = c(3,3,1,5))))

Here are three, reading down the columns of the figure, starting with column one:

  • The default plot method for class "table", which plots the counts as histogram-like bars
  • A bar plot, which is probably what you meant by histogram. Notice the low information-to-ink ratio here
  • A dot plot or dot chart; it shows the same information as the other plots but uses far less ink per unit of information. Preferred.

Code to produce them:

layout(matrix(1:4, ncol = 2))  # 2 x 2 grid, filled column by column
plot(table(dat), main = "plot method for class \"table\"")
barplot(table(dat), main = "barplot")
## dotchart() wants a plain named vector, so drop the "table" class
tab <- as.numeric(table(dat))
names(tab) <- names(table(dat))
dotchart(tab, main = "dotchart or dotplot")
## or just this
## dotchart(table(dat))
## and ignore the warning
layout(1)  # restore the single-panel layout

This produces:

[Figure: the "table" plot method, a bar plot, and a dot chart of the level counts in dat]

If you just have your data in a variable named factor (a bad name choice, by the way, since it shares its name with the base function factor()), then table(factor) can be used rather than table(dat) or table(dat$fac) in my code examples.

For completeness, package lattice is more flexible when it comes to producing the dot plot as we can get the orientation you want:

library(lattice)
## tabulate first so that dotplot() receives the counts per level
dotplot(table(dat$fac), horizontal = FALSE)

giving:

[Figure: lattice dot plot of the level counts, drawn vertically]

And a ggplot2 version:

library(ggplot2)
## tab is the named vector of counts created in the dotchart code above
p <- ggplot(data.frame(Freq = tab, fac = names(tab)), aes(fac, Freq)) +
    geom_point()
p

giving:

[Figure: ggplot2 point plot of Freq against fac]
