开发者

Fast assessment of corrupted Affymetrix CEL files

I'm trying to normalize a big amount of Affymetrix CEL files using R. However, some of them appear to be truncated, so when reading them i get the error

Cel file xxx does not seem to have the correct dimensions

And the normalization stops. Manually removing the corrupted files and restart every time will take very long. Do you know if there is a fast way (in R or with a tool) to detect corrupted files?

PS I'm 99.99% sure I'm normalizing together CELs from the same platform, it's reall开发者_如何学JAVAy just truncated files :-)


One simple suggestion:

Can you just use a tryCatch block around your read.table (or whichever read command you're using)? Then just skip a file if you get that error message. You can also compile a list of corrupted files within the catch block (I recommend doing that so that you are tracking corrupted files for future reference when running a big batch process like this). Here's the pseudo code:

corrupted.files <- data.frame()
for(i in 1:nrow(files)) {
    x <- tryCatch(read.table(file=files[i]), error = function(e) 
         if(e=="something") { corrupted.files <- rbind(corrupted.files, files[i]) } 
         else { stop(e) }, 
       finally=print(paste("finished with", files[i], "at", Sys.time())))
    if(nrow(x)) # do something with the uncorrupted data            
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜