Reshape Melt - Easiest way to refer to correct ID and MEASURED columns
Problem
I want to melt a dataframe with many columns, using very little typing.
The dataframes that I work with often have many columns of which IDs may be numbers, characters or factors. The IDs and MEASURED columns generally aren't contiguous.
What do I do?
Is there something like melt(mydata, id=c(1:7,9,10,12), measured=c(8,11))?
Example
I have a dataframe that looks something like the following
id1 <- round(abs(rnorm(5)),1)
id2 <- sample(letters,5)
id3 <- sample(letters,5)
id4 <- sample(letters,5)
id5 <- sample(letters,5)
id6 <- round(abs(rnorm(5)),1)
id7 <- sample(letters,5)
m1 <- round(abs(rnorm(5)),1)
id8 <- sample(letters,5)
id9 <- sample(letters,5)
m2 <- round(abs(rnorm(5)),1)
id10 <- sample(letters,5)
mydata <- data.frame(id1,id2,id3,id4,id5,id6,id7,m1,id8,id9,m2,id10)
Resulting in....
id1 id2 id3 id4 id5 id6 id7 m1 id8 id9 m2 id10
1.5 c i r m 1.8 f 0.1 x g 0.7 t
0.4 n o q b 0.9 s 0.1 f x 0.0 m
1.6 b g s i 0.7 i 0.5 d z 1.3 b
0.6 g s j k 0.3 j 0.8 p i 0.4 d
0.5 z e i s 0.4 r 0.8 k y 0.9 a
Where "id" means columns that I want as IDs and "m" means columns that I want as MEASURED variables. NOTE: my columns don't actually follow the pattern "id_" or "m_" - they could be anything.
How do I go about correctly and quickly getting melt to work as I intend?
I'd prefer not to have to write out
melt(mydata, id = c("id1","id2",etc, etc, etc), measured = c("m1","m2"))
If all my ID variables were characters, I know I could just write
melt(mydata, measured = c("m1","m2"))
But since some of my ID columns are numeric (id1 and id6 here), they get melted along with m1 and m2, and I get this (incorrect) output
x id2 id3 id4 id5 id7 id8 id9 id10 variable value
1 c i r m f x g t id1 1.5
2 n o q b s f x m id1 0.4
3 b g s i i d z b id1 1.6
4 g s j k j p i d id1 0.6
5 z e i s r k y a id1 0.5
6 c i r m f x g t id6 1.8
7 n o q b s f x m id6 0.9
8 b g s i i d z b id6 0.7
9 g s j k j p i d id6 0.3
10 z e i s r k y a id6 0.4
11 c i r m f x g t m1 0.1
12 n o q b s f x m m1 0.1
13 b g s i i d z b m1 0.5
14 g s j k j p i d m1 0.8
15 z e i s r k y a m1 0.8
16 c i r m f x g t m2 0.7
17 n o q b s f x m m2 0.0
18 b g s i i d z b m2 1.3
19 g s j k j p i d m2 0.4
20 z e i s r k y a m2 0.9
If my dataframe ID and MEASURED columns were contiguous like this
mydata <- data.frame(id1,id2,id3,id4,id5,id6,id7,id8,id9,id10,m1,m2)
then I'd have an easy time using ranges like
melt(mydata, id=1:10, measured = 11:12)
But what do I do if my ID/Measured columns aren't contiguous?
In all of the documentation that I've seen on reshape, including Hadley's papers/presentations, I haven't seen how to easily do this.
I'm sure I'm missing something very simple here...
Okay, I just realised that I can just nest my c's like this
melt(mydata, id=c(c(1:7),9,10,12))
Sorry to have overcomplicated the question :-\
EDIT: Wow. I thought I tried c(1:7,9,10,12) and R complained. But when I try it now, it's fine. It's been a loooooooong day.
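For reference, here is a minimal sketch of the index-based call. It assumes the reshape2 package, whose data-frame melt method takes id.vars and measure.vars; the toy data is made up to mirror the example above, and measure_idx and id_idx are names I made up for illustration. As far as I can tell, id= works because it partially matches id.vars, while measured= is not a prefix of measure.vars and so seems to be silently ignored, which would explain the incorrect output in the question. The melt calls are guarded in case reshape2 isn't installed.

```r
# Toy data mirroring the question's layout: ID and measured columns interleaved.
set.seed(1)
mydata <- data.frame(
  id1 = round(abs(rnorm(5)), 1), id2 = sample(letters, 5),
  id3 = sample(letters, 5),      id4 = sample(letters, 5),
  id5 = sample(letters, 5),      id6 = round(abs(rnorm(5)), 1),
  id7 = sample(letters, 5),      m1  = round(abs(rnorm(5)), 1),
  id8 = sample(letters, 5),      id9 = sample(letters, 5),
  m2  = round(abs(rnorm(5)), 1), id10 = sample(letters, 5)
)

# Only the measured columns need to be listed; everything else is an ID.
measure_idx <- c(8, 11)
id_idx <- setdiff(seq_along(mydata), measure_idx)   # positions 1:7, 9, 10, 12

# Guarded so the sketch still runs without reshape2 installed (assumption:
# reshape2::melt accepts numeric positions for id.vars and measure.vars).
if (requireNamespace("reshape2", quietly = TRUE)) {
  long <- reshape2::melt(mydata, id.vars = id_idx, measure.vars = measure_idx)
  # Shorthand: when id.vars is omitted, the remaining columns are used as IDs.
  long2 <- reshape2::melt(mydata, measure.vars = c("m1", "m2"))
}
```

With the full argument name, naming just the two measured columns is the "very little typing" version, whether or not the columns are contiguous.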
You need to condition on something, be it a pattern in the column names, the columns' classes, or some lists you populated earlier in your code. R can't magically figure out which columns are which.
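A rough sketch of those three options in base R, on a made-up data frame (the column names, the regular expression, and the variable names are all hypothetical, chosen only for illustration). Each option yields a character vector of ID columns that could then be passed to melt.

```r
# Hypothetical small frame: two measured columns among mixed-type IDs.
df <- data.frame(
  site = c("a", "b", "c"),
  year = c(2001, 2002, 2003),        # numeric, but meant as an ID
  m1   = c(0.1, 0.5, 0.8),
  plot = factor(c("x", "y", "z")),
  m2   = c(0.7, 1.3, 0.4)
)

# 1. A list kept earlier in the code: name the short side, derive the rest.
measure_cols <- c("m1", "m2")
id_by_list <- setdiff(names(df), measure_cols)

# 2. A pattern in the column names (only useful if the names follow one).
id_by_pattern <- names(df)[!grepl("^m[0-9]+$", names(df))]

# 3. Column class: treats every numeric column as measured,
#    so a numeric ID such as `year` is left out of the IDs.
id_by_class <- names(df)[!vapply(df, is.numeric, logical(1))]
```

Option 3 is roughly the guess melt appears to make when neither set of columns is given, which is why numeric IDs end up in the value column; options 1 and 2 avoid that at the cost of stating the rule yourself.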