How do you read multiple .txt files into R? [duplicate]

2023-01-10 13:11

This question already has answers here: How to import multiple .csv files at once? (15 answers) Closed 4 years ago.

I'm using R to visualize some data, all of which is in .txt format. There are a few hundred files in a directory and I want to load them all into one table, in one shot.

Any help?

EDIT:

Listing the files is not a problem, but I am having trouble going from the list to the content. I've tried some of the code from here, but I get an error with this part:

all.the.data <- lapply( all.the.files,  txt  , header=TRUE)

saying

 Error in match.fun(FUN) : object 'txt' not found

Any snippets of code that would clarify this problem would be greatly appreciated.

You can try this:

# "\\.txt$" matches only names ending in .txt (".*.txt" would also match e.g. "mytxt")
filelist = list.files(pattern = "\\.txt$")

#assuming tab separated values with a header
datalist = lapply(filelist, function(x) read.table(x, header = TRUE, sep = "\t"))

#assuming the same header/columns for all files
datafr = do.call("rbind", datalist)
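
To see the same lapply/do.call pattern run end to end, here is a minimal, self-contained sketch. It writes two throwaway tab-separated files to a temporary directory first; the directory name, file names and columns are made up for illustration and are not from the question.

```r
# Hypothetical example data: two small tab-separated files in a temp directory
tmp <- file.path(tempdir(), "txt_demo")
dir.create(tmp, showWarnings = FALSE)
write.table(data.frame(id = 1:2, value = c(10, 20)),
            file.path(tmp, "a.txt"), sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(id = 3:4, value = c(30, 40)),
            file.path(tmp, "b.txt"), sep = "\t", row.names = FALSE, quote = FALSE)

# full.names = TRUE returns paths that work from any working directory
filelist <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Read each file, then stack the resulting data frames row-wise
datalist <- lapply(filelist, read.table, header = TRUE, sep = "\t")
datafr <- do.call(rbind, datalist)
```

Note that lapply() needs an actual function, such as read.table, as its second argument. The `txt` in the question is not one, which is why match.fun() complains.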

There are three fast ways to read multiple files and put them into a single data frame or data table.

First, get the list of all txt files (including those in sub-folders):

list_of_files <- list.files(path = ".", recursive = TRUE,
                            pattern = "\\.txt$", 
                            full.names = TRUE)

1) Use fread() w/ rbindlist() from the data.table package

#install.packages("data.table", repos = "https://cran.rstudio.com")
library(data.table)

# Read all the files and create a FileName column to store filenames
DT <- rbindlist(sapply(list_of_files, fread, simplify = FALSE),
                use.names = TRUE, idcol = "FileName")

2) Use readr::read_table() w/ purrr::map_df() from the tidyverse framework:

#install.packages("tidyverse", 
#                 dependencies = TRUE, repos = "https://cran.rstudio.com")
library(tidyverse)

# Read all the files and create a FileName column to store filenames
# (read_table2() was deprecated in readr 2.0.0 in favour of read_table();
#  on older readr versions, use read_table2 here instead)
df <- list_of_files %>%
  set_names(.) %>%
  map_df(read_table, .id = "FileName")

3) (Probably the fastest out of the three) Use vroom::vroom():

#install.packages("vroom", 
#                 dependencies = TRUE, repos = "https://cran.rstudio.com")
library(vroom)

# Read all the files and create a FileName column to store filenames
# (vroom's argument is named id, without the leading dot)
df <- vroom(list_of_files, id = "FileName")

Note: to clean up file names, use basename or gsub functions
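
As a rough illustration of that clean-up, the sketch below strips the directory and extension from a couple of paths. The paths are hypothetical, chosen only to resemble what list.files(full.names = TRUE, recursive = TRUE) might return.

```r
# Hypothetical paths, as list.files(full.names = TRUE, recursive = TRUE) might return them
paths <- c("./data/run1/sample_a.txt", "./data/run2/sample_b.txt")

# basename() drops the directory part
file_names <- basename(paths)

# tools::file_path_sans_ext() additionally drops the extension
clean <- tools::file_path_sans_ext(file_names)

# The same result with a regular expression via gsub()
clean_gsub <- gsub("\\.txt$", "", file_names)
```

Either vector could then replace the FileName column, for example after the files have been read.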

Benchmark: readr vs data.table vs vroom for big data

Edit 1: to read multiple csv files and skip the header using readr::read_csv

list_of_files <- list.files(path = ".", recursive = TRUE,
                            pattern = "\\.csv$", 
                            full.names = TRUE)

df <- list_of_files %>%
  purrr::set_names(nm = (basename(.) %>% tools::file_path_sans_ext())) %>%
  purrr::map_df(read_csv, 
                col_names = FALSE,
                skip = 1,
                .id = "FileName")

Edit 2: to convert a pattern including a wildcard into the equivalent regular expression, use glob2rx()
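
A small sketch of how that might look in practice. The candidate file names are made up, and grep() is used here only to show which names the converted pattern matches.

```r
# glob2rx() ships with R (utils package); it turns a wildcard pattern into a regex
rx <- glob2rx("*.txt")

# Hypothetical file names, used only to show what the pattern matches
candidates <- c("a.txt", "b.csv", "notes.txt.bak")
matched <- grep(rx, candidates, value = TRUE)
```

The resulting rx can be passed as the pattern argument of list.files() in the same way.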

There is a really, really easy way to do this now: the readtext package.

readtext::readtext("path_to/your_files/*.txt")

It really is that easy.

Look at the help for the functions dir() and list.files() (the two are aliases). They let you get a list of files, possibly filtered by a regular expression, over which you can loop.

If you want to read them all at once, you first have to have the content in one stream. One option would be to use cat to write all the files to stdout and read that through a pipe() connection (R's interface to popen). See help(Connections) for more.

Thanks for all the answers!

In the meantime, I also hacked together a method of my own. Let me know if it is of any use:

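setwd("/path/to/directory")

files <- list.files()

# Start from an empty character vector; starting from 0 would leave
# a stray "0" as the first element of the result
data <- character(0)

for (f in files) {
  # scan() returns the whitespace-separated tokens of the file as strings
  tempData <- scan(f, what = "character")

  # Append this file's tokens to everything read so far
  data <- c(data, tempData)
}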
