How to analyze data whose rows have different numbers of elements using R?
The data format is as follows; the first column is the id:
1, b, c
2, a, d, e, f
3, u, i, c
4, k, m
5, o
However, I can do nothing to analyze this data. Do you have a good idea of how to read the data into R? More generally, my question is: how can I analyze data whose rows have different numbers of elements using R?
It seems you are trying to read a file whose rows have unequal numbers of elements. The R structure suited to this is a list.
It is possible to do this by calling read.table with sep="\n", so that each line is read as a single string, and then applying strsplit to each row of the data.
Here is an example:
dat <- "
1 A B
2 C D E
3 F G H I J
4 K L
5 M"
The code to read and convert to a list:
x <- read.table(textConnection(dat), sep="\n")
apply(x, 1, function(i)strsplit(i, "\\s")[[1]])
The results:
[[1]]
[1] "1" "A" "B"
[[2]]
[1] "2" "C" "D" "E"
[[3]]
[1] "3" "F" "G" "H" "I" "J"
[[4]]
[1] "4" "K" "L"
[[5]]
[1] "5" "M"
You can now use any list manipulation technique to work with your data.
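As a rough illustration of what such list manipulation could look like, the sketch below rebuilds the parsed result by hand and then splits off the id, counts the elements per row, and reshapes everything into a long table. The names rows, ids, values, n_items and long are made up for this example and are not part of the answer above.

```r
# The parsed result from above, written out by hand so the sketch is self-contained
rows <- list(
  c("1", "A", "B"),
  c("2", "C", "D", "E"),
  c("3", "F", "G", "H", "I", "J"),
  c("4", "K", "L"),
  c("5", "M")
)

# Treat the first element of each row as its id and keep the rest as values
ids    <- vapply(rows, function(r) r[1], character(1))
values <- setNames(lapply(rows, function(r) r[-1]), ids)

# Number of elements per id
n_items <- lengths(values)

# Long format: one (id, value) pair per line, which most R tools can work with
long <- data.frame(
  id    = rep(ids, times = n_items),
  value = unlist(values, use.names = FALSE),
  stringsAsFactors = FALSE
)
```

The long data frame is often the easiest form to analyze further, for example with table(long$value) or aggregate().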
Another option is to use readLines and strsplit to solve this problem:
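text <- readLines("./xx.txt", encoding = "UTF-8", n = -1L)
# strsplit takes split, not sep. unlist() flattens the result into one vector;
# drop it to keep one character vector per line.
txt <- unlist(strsplit(text, split = " "))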
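For the comma-separated sample in the question, the same readLines and strsplit idea might look like the sketch below. It reads from an inline textConnection instead of a file so that it runs on its own, and it assumes the separator is a comma followed by optional spaces. The names lines, parts and values are chosen only for this example.

```r
# The sample from the question, inlined so the sketch is self-contained
con <- textConnection("1, b, c
2, a, d, e, f
3, u, i, c
4, k, m
5, o")
lines <- readLines(con)
close(con)

# Split each line on a comma followed by optional whitespace;
# the result is a list with one character vector per row
parts <- strsplit(lines, ",\\s*")

# Use the id (first element) as the name and keep the remaining elements
values <- setNames(
  lapply(parts, function(p) p[-1]),
  vapply(parts, function(p) p[1], character(1))
)

# How often each element occurs across all rows
counts <- table(unlist(values))
```

With a real file, the textConnection call would be replaced by the file path passed to readLines.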