R split column depending on values
I have the following data and want to calculate the total number of minutes. Is it possible to split the column into two, with minutes in one column and seconds in another?
> q
time
1 0m 22s
2 1m 7s
3 3m 35s
4 11m 43s
5 1m 8s
6 2m 21s
7 9m 33s
8 0m 56s
9 0m 2s
10 0m 2s
11 0m 50s
12 0m 25s
13 0m 33s
14 2m 26s
15 0m 20s
16 1m 47s
17 0m 36s
18 0m 3s
19 0m 2s
20 0m 5s
==> To give:
> q
min seconds
1 0 22
2 1 7
etc
I am not so familiar with dates, but you can look into the functions as.Date or strptime.
Using your data.frame:
df <- data.frame(time = c("0m 22s", "1m 7s", "3m 35s", "11m 43s", "1m 8s", "2m 21s", "9m 33s", "0m 56s", "0m 2s", "0m 2s", "0m 50s", "0m 25s", "0m 33s", "2m 26s", "0m 20s", "1m 47s", "0m 36s", "0m 3s", "0m 2s", "0m 5s"))
df$time.2 <- strptime(df$time, "%Mm %Ss")
Now you can select the specific values. To see which components are available, take a look at
attributes(df[, "time.2"])
and assign
df$min <- df[, "time.2"][["min"]]
df$sec <- df[, "time.2"][["sec"]]
this gives:
R> df
time time.2 min sec
1 0m 22s 2010-12-02 00:00:22 0 22
2 1m 7s 2010-12-02 00:01:07 1 7
3 3m 35s 2010-12-02 00:03:35 3 35
4 11m 43s 2010-12-02 00:11:43 11 43
5 1m 8s 2010-12-02 00:01:08 1 8
6 2m 21s 2010-12-02 00:02:21 2 21
7 9m 33s 2010-12-02 00:09:33 9 33
8 0m 56s 2010-12-02 00:00:56 0 56
9 0m 2s 2010-12-02 00:00:02 0 2
10 0m 2s 2010-12-02 00:00:02 0 2
11 0m 50s 2010-12-02 00:00:50 0 50
12 0m 25s 2010-12-02 00:00:25 0 25
13 0m 33s 2010-12-02 00:00:33 0 33
14 2m 26s 2010-12-02 00:02:26 2 26
15 0m 20s 2010-12-02 00:00:20 0 20
16 1m 47s 2010-12-02 00:01:47 1 47
17 0m 36s 2010-12-02 00:00:36 0 36
18 0m 3s 2010-12-02 00:00:03 0 3
19 0m 2s 2010-12-02 00:00:02 0 2
20 0m 5s 2010-12-02 00:00:05 0 5
EDIT:
Since you only want to split the data.frame in order to calculate the total sum of minutes, you do not even need to create the new columns min and sec; you can simply work with the column time.2.
These two steps are already enough:
df$time.2 <- strptime(df$time, "%Mm %Ss")
sum(df[, "time.2"][["min"]])
R> [1] 30
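Note that the sum above counts only the whole-minute parts. If the seconds should contribute to the total as well, a minimal sketch could look like the following. The names times, parsed, total_sec and total_min are illustrative and not part of the answer above, and the sketch assumes every minute and second value is below 60, since %M and %S do not accept larger values.

```r
# A few values in the same "Xm Ys" format as the question (subset, for illustration)
times <- c("0m 22s", "1m 7s", "3m 35s", "11m 43s")

# strptime() returns a POSIXlt object; its components can be read with $
parsed <- strptime(times, "%Mm %Ss", tz = "UTC")

# Total duration in seconds, then converted to (fractional) minutes
total_sec <- sum(parsed$min) * 60 + sum(parsed$sec)
total_min <- total_sec / 60
```

Here total_sec combines both components, so seconds that add up to more than a minute are carried into the total.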
If you want a fast solution, consider one based on gsub:
time <- q$time  # the column from the question's data.frame
min <- as.numeric(sub("m.*$", "", time))
sec <- as.numeric(gsub("^.*\\ |s$", "", time))
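As a self-contained sketch of the same idea, assuming every input string has the form "Xm Ys" (the names times, minutes, seconds and result are illustrative only):

```r
times <- c("0m 22s", "1m 7s", "11m 43s")

# Everything before the first "m" is the minutes part
minutes <- as.numeric(sub("m.*$", "", times))

# Remove everything up to the space, and the trailing "s", to leave the seconds
seconds <- as.numeric(gsub("^.* |s$", "", times))

# Combine into the two-column layout asked for in the question
result <- data.frame(min = minutes, seconds = seconds)
```

Using separate names such as minutes and seconds avoids masking the base function min() in the workspace.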
There are a few threads on Stack Overflow using gsub:
- How do I specify a dynamic position for the start of substring
- How to get the second sub element of every element in a list in R
- How to pick up a set of numbers from the end of lines with irregular length in R
NOTE: I'm sure there are more elegant methods, but this is the first solution that came to mind.
Step 1) Get rid of the letters (and any trailing space):
Data <- q
minsec_str <- apply(Data,1, function(x) gsub("[[:alpha:]]| $","",x))
Step 2) Split into two strings, convert strings to numeric, and rbind
minsec <- do.call(rbind, lapply(strsplit(minsec_str, " "), as.numeric))
Step 3) Add colnames and convert to data.frame
colnames(minsec) <- c("min","sec")
minsec <- data.frame(minsec)
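For reference, the three steps can be run end to end on a small sample. This is only a sketch: dat is a hypothetical stand-in for the question's data.frame q, the gsub call is applied directly to the column rather than through apply, and the final line (not part of the steps above) computes the total in minutes.

```r
# Hypothetical stand-in for the question's data.frame
dat <- data.frame(time = c("0m 22s", "1m 7s", "3m 35s"), stringsAsFactors = FALSE)

# Step 1: strip the letters, leaving "0 22", "1 7", "3 35"
minsec_str <- gsub("[[:alpha:]]", "", dat$time)

# Step 2: split on the space and convert each piece to numeric
minsec <- do.call(rbind, lapply(strsplit(minsec_str, " "), as.numeric))

# Step 3: name the columns and convert to a data.frame
colnames(minsec) <- c("min", "sec")
minsec <- data.frame(minsec)

# Total time in minutes, counting the seconds as fractions of a minute
total_min <- sum(minsec$min) + sum(minsec$sec) / 60
```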