
How to read whitespace delimited strings until EOF in R

I am new to R and am having trouble reading a series of strings until I encounter EOF. Not only do I not know how to detect EOF, I also don't know how to read a single whitespace-separated string, which is trivial in every other language I have seen so far. In C, I would simply do:

while (scanf("%s", s) == 1) { /* do something with s */ }

If possible, I would prefer a solution which does not require knowing the maximum length of strings in advance.

Any ideas?

EDIT: I am looking for a solution that does not store all the input in memory, but one that is equivalent, or at least similar, to the C code above.


Here's a way to read one item at a time... It uses the fact that scan has an nmax parameter (and n and nlines - it's actually kind of a mess!).

# First create a sample file to read from...
writeLines(c("Hello world", "and now", "Goodbye"), "foo.txt")

# Use a file connection to read from...
f <- file("foo.txt", "r")

i <- 0L
repeat {
   s <- scan(f, "", nmax=1, quiet=TRUE)
   if (length(s) == 0) break
   i <- i + 1L
   cat("Read item #", i, ": ", s, "\n", sep="")
}
close(f)

When scan encounters EOF, it returns a zero-length vector. So a more obscure but C-like way would be the following (with f freshly opened and i reset to 0L, as above):

while (length(s <- scan(f, "", nmax=1, quiet=TRUE))) {
   i <- i + 1L
   cat("Read item #", i, ": ", s, "\n", sep="")
}

In any case, the output would be:

Read item #1: Hello
Read item #2: world
Read item #3: and
Read item #4: now
Read item #5: Goodbye
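
If the loop is needed in more than one place, it could be wrapped in a small helper that takes any open connection. The sketch below is only an illustration: for_each_word and its arguments are hypothetical names, not part of R. For a script that reads standard input, the connection could be file("stdin") instead of a file on disk.

```r
# Hypothetical helper: calls fn() on each whitespace-delimited item
# read from an already open connection, one item at a time.
for_each_word <- function(con, fn) {
   n <- 0L
   # scan() returns a zero-length vector at EOF, which ends the loop
   while (length(s <- scan(con, "", nmax = 1, quiet = TRUE))) {
      fn(s)
      n <- n + 1L
   }
   invisible(n)   # number of items read
}

# Sample input with the same contents as foo.txt above
path <- tempfile()
writeLines(c("Hello world", "and now", "Goodbye"), path)

f <- file(path, "r")
n_read <- for_each_word(f, function(w) cat("word: ", w, "\n", sep = ""))
close(f)
```

Only one item is held in memory at a time, which is the closest match to the scanf loop in the question.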

Finally, if you could vectorize what you do to the strings, you should probably try to read a bunch of them at a time - just change nmax to, say, 10000.
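
As a rough sketch of that chunked approach (the chunk size and the per-chunk work are placeholders chosen for illustration), each call to scan returns up to chunk_size items, and the vectorized operation runs on the whole chunk at once:

```r
# Sample input with the same contents as foo.txt above
path <- tempfile()
writeLines(c("Hello world", "and now", "Goodbye"), path)

f <- file(path, "r")
chunk_size <- 2L      # tiny so that several chunks are read; use e.g. 10000 for real data
n_items <- 0L
total_chars <- 0L
repeat {
   s <- scan(f, "", nmax = chunk_size, quiet = TRUE)
   if (length(s) == 0) break              # zero-length result means EOF
   n_items <- n_items + length(s)
   total_chars <- total_chars + sum(nchar(s))   # vectorized over the chunk
}
close(f)
cat("items:", n_items, " characters:", total_chars, "\n")
```

Memory use is bounded by the chunk size rather than by the size of the input.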


> txt <- "This is an example"  # could be from a file but will use textConnection()
> read.table(textConnection(txt))
    V1 V2 V3      V4
1 This is an example

read.table is implemented with scan, so you can look at its source (type read.table at the prompt to print it) to see how the experts did it.
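
For comparison, a minimal sketch of calling scan directly on the same text connection, which returns a plain character vector instead of a data frame. Like read.table, this reads everything in one go, so it does not address the memory concern in the question:

```r
txt <- "This is an example"
con <- textConnection(txt)
# what = "" asks scan for character items; whitespace is the default separator
words <- scan(con, what = "", quiet = TRUE)
close(con)
print(words)
```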
