
R split column depending on values

I have the following data and I want to calculate the total number of minutes. Is it possible to split the column into two, with minutes in one column and seconds in another?

> q
       time
1   0m 22s 
2    1m 7s 
3   3m 35s 
4  11m 43s 
5    1m 8s 
6   2m 21s 
7   9m 33s 
8   0m 56s 
9    0m 2s 
10   0m 2s 
11  0m 50s 
12  0m 25s 
13  0m 33s 
14  2m 26s 
15  0m 20s 
16  1m 47s 
17  0m 36s 
18   0m 3s 
19   0m 2s 
20   0m 5s 

==> To give:

> q
    min    seconds
1   0     22
2   1     7

etc


I am not so familiar with dates but you can look into the functions as.Date or strptime. Using your data.frame:

df <- data.frame(time = c("0m 22s", "1m 7s", "3m 35s", "11m 43s", "1m 8s", "2m 21s", "9m 33s", "0m 56s", "0m 2s", "0m 2s", "0m 50s", "0m 25s", "0m 33s", "2m 26s", "0m 20s", "1m 47s", "0m 36s", "0m 3s", "0m 2s", "0m 5s"))

df$time.2 <- strptime(df$time, "%Mm %Ss")

Now you can extract the individual components. To see which ones are available, take a look at

attributes(df[, "time.2"])

and assign

df$min <- df[, "time.2"][["min"]]
df$sec <- df[, "time.2"][["sec"]]

this gives:

R> df
      time              time.2 min sec
1   0m 22s 2010-12-02 00:00:22   0  22
2    1m 7s 2010-12-02 00:01:07   1   7
3   3m 35s 2010-12-02 00:03:35   3  35
4  11m 43s 2010-12-02 00:11:43  11  43
5    1m 8s 2010-12-02 00:01:08   1   8
6   2m 21s 2010-12-02 00:02:21   2  21
7   9m 33s 2010-12-02 00:09:33   9  33
8   0m 56s 2010-12-02 00:00:56   0  56
9    0m 2s 2010-12-02 00:00:02   0   2
10   0m 2s 2010-12-02 00:00:02   0   2
11  0m 50s 2010-12-02 00:00:50   0  50
12  0m 25s 2010-12-02 00:00:25   0  25
13  0m 33s 2010-12-02 00:00:33   0  33
14  2m 26s 2010-12-02 00:02:26   2  26
15  0m 20s 2010-12-02 00:00:20   0  20
16  1m 47s 2010-12-02 00:01:47   1  47
17  0m 36s 2010-12-02 00:00:36   0  36
18   0m 3s 2010-12-02 00:00:03   0   3
19   0m 2s 2010-12-02 00:00:02   0   2
20   0m 5s 2010-12-02 00:00:05   0   5

EDIT: since you only want to split the data.frame in order to calculate the total sum of minutes, you do not even need to create the new columns min and sec; you can simply work with the column time.2. These two steps are already enough:

df$time.2 <- strptime(df$time, "%Mm %Ss")
sum(df[, "time.2"][["min"]])

R> [1] 30
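Note that the sum above counts only the minutes component of each row; the seconds are left out. A minimal sketch of folding them in, assuming the same strptime format and a hypothetical vector name `times` holding the values from the question (the `tz = "UTC"` argument is my addition, to keep the parse independent of the local time zone):

```r
# Values copied from the question; `times` is a hypothetical name.
times <- c("0m 22s", "1m 7s", "3m 35s", "11m 43s", "1m 8s", "2m 21s",
           "9m 33s", "0m 56s", "0m 2s", "0m 2s", "0m 50s", "0m 25s",
           "0m 33s", "2m 26s", "0m 20s", "1m 47s", "0m 36s", "0m 3s",
           "0m 2s", "0m 5s")

# strptime returns a POSIXlt object whose $min and $sec hold the parsed parts.
parsed <- strptime(times, "%Mm %Ss", tz = "UTC")

# Convert everything to seconds, then back to minutes.
total_sec <- sum(parsed$min) * 60 + sum(parsed$sec)
total_min <- total_sec / 60          # fractional minutes

# Whole minutes and leftover seconds, for display.
whole_min <- total_sec %/% 60
left_sec  <- total_sec %% 60
```

With this data the minutes column alone sums to 30, while the full total comes to 37 minutes 56 seconds.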


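If you want a fast solution, consider one based on sub and gsub. Using the time column of the data.frame df defined above:

# Everything from the first "m" onwards is dropped, leaving the minutes.
min <- as.numeric(sub("m.*$", "", df$time))
# Everything up to the space and the trailing "s" are dropped, leaving the seconds.
sec <- as.numeric(gsub("^.*\\ |s$", "", df$time))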

There are a few threads on StackOverflow using gsub:

  • How do I specify a dynamic position for the start of substring
  • How to get the second sub element of every element in a list in R
  • How to pick up a set of numbers from the end of lines with irregular length in R
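The printout in the question suggests the values may carry a trailing space ("0m 22s "). In that case a greedy pattern that runs up to the last space would swallow the whole string. A minimal sketch that trims first and then captures the digits, assuming a hypothetical sample vector `raw` and R 3.2.0 or later for `trimws`:

```r
# Hypothetical sample with trailing spaces, as the question's printout suggests.
raw <- c("0m 22s ", "1m 7s ", "11m 43s ")

# Remove leading and trailing whitespace before matching.
clean <- trimws(raw)

# Minutes: drop everything from the first "m" onwards.
mins <- as.numeric(sub("m.*$", "", clean))

# Seconds: capture the digits between the "m" and the final "s".
secs <- as.numeric(sub("^.*m[[:space:]]*([0-9]+)s$", "\\1", clean))

result <- data.frame(min = mins, sec = secs)
```

The names `mins` and `secs` are used here instead of `min` and `sec` only to avoid masking the base function `min`.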


NOTE: I'm sure there are more elegant methods, but this is the first solution that came to mind.

Step 1) get rid of characters (including trailing spaces):

Data <- q
minsec_str <- apply(Data,1, function(x) gsub("[[:alpha:]]| $","",x))

Step 2) Split into two strings, convert strings to numeric, and rbind

minsec <- do.call(rbind, lapply(strsplit(minsec_str, " "), as.numeric))

Step 3) Add colnames and convert to data.frame

colnames(minsec) <- c("min","sec")
minsec <- data.frame(minsec)
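
The three steps can be tried end to end on a small stand-in for `q`. This is only a sketch: `q_df` is a hypothetical three-row data.frame, and `gsub` is applied directly to the column rather than through `apply`, which should give the same strings for a single-column input.

```r
# Hypothetical stand-in for the asker's `q`, with trailing spaces kept.
q_df <- data.frame(time = c("0m 22s ", "1m 7s ", "3m 35s "),
                   stringsAsFactors = FALSE)

# Step 1: strip letters and the trailing space, e.g. "0m 22s " becomes "0 22".
minsec_str <- gsub("[[:alpha:]]| $", "", q_df$time)

# Step 2: split on the remaining space, convert to numeric, bind rows.
minsec <- do.call(rbind, lapply(strsplit(minsec_str, " "), as.numeric))

# Step 3: name the columns and convert the matrix to a data.frame.
colnames(minsec) <- c("min", "sec")
minsec <- data.frame(minsec)
```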