Reshape Melt - Easiest way to refer to correct ID and MEASURED columns

Problem

I want to melt a dataframe with many columns, using very little typing.

The dataframes that I work with often have many columns, and the ID columns may be numeric, character or factor. The ID and MEASURED columns generally aren't contiguous.

What do I do?

Is there something like melt(mydata, id=c(1:7,9,10,12), measured=c(8,11))?

Example

I have a dataframe that looks something like the following

id1 <- round(abs(rnorm(5)),1)
id2 <- sample(letters,5)
id3 <- sample(letters,5)
id4 <- sample(letters,5)
id5 <- sample(letters,5)
id6 <- round(abs(rnorm(5)),1)
id7 <- sample(letters,5)
m1 <-  round(abs(rnorm(5)),1)
id8 <- sample(letters,5)
id9 <- sample(letters,5)
m2 <-  round(abs(rnorm(5)),1)
id10 <- sample(letters,5)    
mydata <- data.frame(id1,id2,id3,id4,id5,id6,id7,m1,id8,id9,m2,id10)

Resulting in....

  id1 id2 id3 id4 id5 id6 id7  m1 id8 id9  m2 id10
1.5   c   i   r   m 1.8   f 0.1   x   g 0.7    t
0.4   n   o   q   b 0.9   s 0.1   f   x 0.0    m
1.6   b   g   s   i 0.7   i 0.5   d   z 1.3    b
0.6   g   s   j   k 0.3   j 0.8   p   i 0.4    d
0.5   z   e   i   s 0.4   r 0.8   k   y 0.9    a

Where "id" means columns that I want as IDs and "m" means columns that I want as MEASURED variables. NOTE: my columns don't actually follow the pattern "id_" or "m_" - they could be anything.

How do I go about correctly and quickly getting melt to work as I intend?

I'd prefer not to have to write out

melt(mydata, id = c("id1","id2",etc, etc, etc), measured = c("m1","m2"))

If all my ID variables were characters, I know I could just write

melt(mydata, measured = c("m1","m2"))

But since some of my ID columns are numeric (id1 and id6), I get this (incorrect) output

x   id2 id3 id4 id5 id7 id8 id9 id10 variable value
1    c   i   r   m   f   x   g    t      id1   1.5
2    n   o   q   b   s   f   x    m      id1   0.4
3    b   g   s   i   i   d   z    b      id1   1.6
4    g   s   j   k   j   p   i    d      id1   0.6
5    z   e   i   s   r   k   y    a      id1   0.5
6    c   i   r   m   f   x   g    t      id6   1.8
7    n   o   q   b   s   f   x    m      id6   0.9
8    b   g   s   i   i   d   z    b      id6   0.7
9    g   s   j   k   j   p   i    d      id6   0.3
10   z   e   i   s   r   k   y    a      id6   0.4
11   c   i   r   m   f   x   g    t       m1   0.1
12   n   o   q   b   s   f   x    m       m1   0.1
13   b   g   s   i   i   d   z    b       m1   0.5
14   g   s   j   k   j   p   i    d       m1   0.8
15   z   e   i   s   r   k   y    a       m1   0.8
16   c   i   r   m   f   x   g    t       m2   0.7
17   n   o   q   b   s   f   x    m       m2   0.0
18   b   g   s   i   i   d   z    b       m2   1.3
19   g   s   j   k   j   p   i    d       m2   0.4
20   z   e   i   s   r   k   y    a       m2   0.9

If my dataframe ID and MEASURED columns were contiguous like this

mydata <- data.frame(id1,id2,id3,id4,id5,id6,id7,id8,id9,id10,m1,m2)

then I'd have an easy time using ranges like

melt(mydata, id=1:10, measured = 11:12)

But what do I do if my ID/Measured columns aren't contiguous?

In all of the documentation that I've seen on reshape, including Hadley's papers/presentations, I haven't seen how to easily do this.

I'm sure I'm missing something very simple here...


Okay, I just realised that I can just nest my c's like this

melt(mydata, id=c(c(1:7),9,10,12))

Sorry to have overcomplicated the question :-\

EDIT: Wow. I thought I tried c(1:7,9,10,12) and R complained. But I tried it again just now and it's fine. It's been a loooooooong day.
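For anyone landing here later, a minimal sketch of why the flat form works: `:` returns a plain vector and `c()` flattens everything it is given, so the nested and flat calls build the same index vector. The sketch also shows `setdiff()` as one way to type only the (few) measured positions. It assumes the reshape2 package for the melt call and uses the full argument names `id.vars` and `measure.vars`; as far as I can tell, `measured` is not a prefix of either, so it may be silently ignored. The toy data only mimics the column layout above.

```r
# Toy frame with the same column order as the question (values are arbitrary).
set.seed(1)
cols <- c("id1", "id2", "id3", "id4", "id5", "id6", "id7",
          "m1", "id8", "id9", "m2", "id10")
numeric_like <- c("id1", "id6", "m1", "m2")
mydata <- as.data.frame(
  setNames(lapply(cols, function(nm) {
    if (nm %in% numeric_like) round(abs(rnorm(5)), 1) else sample(letters, 5)
  }), cols),
  stringsAsFactors = FALSE
)

# c() flattens its arguments, so these two are the same vector.
flat   <- c(1:7, 9, 10, 12)
nested <- c(c(1:7), 9, 10, 12)
identical(flat, nested)

# Alternative: spell out only the measured positions, derive the rest.
measure_idx <- c(8, 11)
id_idx <- setdiff(seq_along(mydata), measure_idx)

# Assumption: reshape2 is installed; skipped quietly otherwise.
if (requireNamespace("reshape2", quietly = TRUE)) {
  long <- reshape2::melt(mydata, id.vars = id_idx, measure.vars = measure_idx)
  print(head(long))
}
```

With many ID columns and only a couple of measured ones, the `setdiff()` form is probably the least typing.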


You need to condition on something, be it a pattern in the column names, the columns' classes, or a list you populated earlier in your code. R can't magically figure out which columns are which.
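A rough sketch of those three options in base R, on a small made-up frame (the column names and the `^m[0-9]+$` pattern are only for illustration, since the real columns don't follow a naming pattern). The resulting character vectors could then be passed to melt as `id.vars` and `measure.vars`.

```r
# Hypothetical, smaller version of the question's data.
set.seed(42)
mydata <- data.frame(
  id1 = round(abs(rnorm(5)), 1),
  id2 = sample(letters, 5),
  m1  = round(abs(rnorm(5)), 1),
  id8 = sample(letters, 5),
  m2  = round(abs(rnorm(5)), 1),
  stringsAsFactors = FALSE
)

# 1. A list you wrote down earlier: name the measured columns, derive the IDs.
measure_cols <- c("m1", "m2")
id_cols <- setdiff(names(mydata), measure_cols)

# 2. A pattern in the column names (only useful if the names cooperate).
measure_by_pattern <- grep("^m[0-9]+$", names(mydata), value = TRUE)

# 3. The column class. Note this also picks up numeric ID columns such as id1,
#    which is the mis-sorting shown in the question's output.
numeric_cols <- names(mydata)[vapply(mydata, is.numeric, logical(1))]
```

Option 1 is usually the safest when numeric IDs are mixed in, because class alone cannot tell an ID from a measurement.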
