
Extract file extension from file path

How can I extract the extension of a file, given the file path as a character string? I know I can do this with a regular expression, regexpr("\\.([[:alnum:]]+)$", x), but I am wondering whether there is a built-in function for this.


This is the sort of thing that is easily found with R's basic tools, e.g. ??path.

Anyway, load the tools package and read ?file_ext.


Let me extend the great answer from https://stackoverflow.com/users/680068/zx8754 a little.

Here is a simple code snippet:

  # 1. Load library 'tools'
  library("tools")

  # 2. Get extension for file 'test.txt'
  file_ext("test.txt")

The result should be 'txt'.
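
As a small extension of the snippet above, here is a sketch showing that file_ext() also accepts a character vector and returns only the final extension. The file names are made up for illustration.

```r
# file_ext() returns one extension per element of a character vector.
# The file names below are hypothetical examples.
files <- c("report.txt", "archive.tar.gz", "README")
exts <- tools::file_ext(files)
print(exts)
# Only the last component is returned for "archive.tar.gz",
# and a name without an extension yields an empty string.
```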


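A simple function with no package to load:

getExtension <- function(file){
    # split the base name (directory removed) on literal dots
    ex <- strsplit(basename(file), split="\\.")[[1]]
    # drop the part before the first dot; "a.tar.gz" yields c("tar", "gz")
    return(ex[-1])
}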


The regexpr above fails if the extension contains non-alphanumeric characters (see e.g. https://en.wikipedia.org/wiki/List_of_filename_extensions). As an alternative, one may use the following function:

getFileNameExtension <- function (fn) {
    # remove the path (or use .Platform$file.sep instead of '/')
    splitted <- strsplit(x = fn, split = '/')[[1]]
    fn       <- splitted[length(splitted)]
    ext      <- ''
    splitted <- strsplit(x = fn, split = '\\.')[[1]]
    l        <- length(splitted)
    # the extension must be the suffix of a non-empty name
    if (l > 1 && any(splitted[1:(l - 1)] != '')) ext <- splitted[l]
    ext
}


Extract the file extension without the dot:

tools::file_ext(fileName)

Extract the file extension with the dot:

paste0(".", tools::file_ext(fileName))
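
Note that paste0(".", "") gives "." for a file name without an extension. A minimal sketch of one way around that, assuming you want an empty string in that case (dot_ext is a hypothetical helper name, not part of the tools package):

```r
# Hypothetical helper: return the extension with a leading dot,
# or "" when the file name has no extension.
dot_ext <- function(fileName) {
  ext <- tools::file_ext(fileName)
  ifelse(nzchar(ext), paste0(".", ext), "")
}

res <- dot_ext(c("data.csv", "Makefile"))
print(res)
```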


If you don't want to use any additional package, you could try:

file_extension <- function(filenames) {
    sub(pattern = "^(.*\\.|[^.]+)(?=[^.]*)", replacement = "", filenames, perl = TRUE)
}

If you like to be cryptic you could try to use it as a one-line expression: sub("^(.*\\.|[^.]+)(?=[^.]*)", "", filenames, perl = TRUE) ;-)

It works for zero (!), one or more file names (as a character vector or list) with an arbitrary number of dots, and also for file names without any extension, where it returns the empty string "".

Here are the tests I tried:

> file_extension("simple.txt")
[1] "txt"
> file_extension(c("no extension", "simple.ext1", "with.two.ext2", "some.awkward.file.name.with.a.final.dot.", "..", ".", ""))
[1] ""     "ext1" "ext2" ""     ""     ""     ""    
> file_extension(list("file.ext1", "one.more.file.ext2"))
[1] "ext1" "ext2"
> file_extension(NULL)
character(0)
> file_extension(c())
character(0)
> file_extension(list())
character(0)

By the way, tools::file_ext() has trouble finding "strange" extensions with non-alphanumeric characters:

> tools::file_ext("file.zi_")
[1] ""
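
If such extensions matter, one option is to keep the regexpr-based approach but relax the character class. The following is only a sketch under that assumption; file_ext_any is a hypothetical name, and the pattern accepts any characters except a dot after the final dot of the base name.

```r
# Hypothetical variant of the regexpr-based approach: accept any
# non-dot characters after the last dot of the base name.
file_ext_any <- function(x) {
  x <- basename(x)                      # drop any directory part
  pos <- regexpr("\\.[^.]+$", x)        # position of the final ".ext"
  ifelse(pos > -1L, substring(x, pos + 1L), "")
}

res <- file_ext_any(c("file.zi_", "test.txt", "noExtension", "trailing.dot."))
print(res)
```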


This function uses pipes:

library(magrittr)

file_ext <- function(f_name) {
  f_name %>%
    strsplit(".", fixed = TRUE) %>%
    unlist %>%
    tail(1)  # keep the last dot-separated component
}

file_ext("test.txt")
# [1] "txt"


Simplest way I've found with no additional packages:

FileExt <- function(filename) {
  nameSplit <- strsplit(x = filename, split = "\\.")[[1]]
  return(nameSplit[length(nameSplit)])
}
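
The function above handles a single file name and returns the whole name when there is no dot. A possible vectorised sketch of the same strsplit idea, assuming an empty string is wanted for names without an extension (file_ext_split is a hypothetical name):

```r
# Hypothetical vectorised variant of the strsplit() approach.
file_ext_split <- function(filenames) {
  # split each base name on literal dots
  parts <- strsplit(basename(filenames), ".", fixed = TRUE)
  # take the last piece only when there is more than one piece
  vapply(parts, function(p) {
    if (length(p) > 1L) p[length(p)] else ""
  }, character(1))
}

res <- file_ext_split(c("test.txt", "with.two.ext2", "noExtension"))
print(res)
```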


One way would be to use sub:

s <- c("test.txt", "file.zi_", "noExtension", "with.two.ext2",
       "file.with.final.dot.", "..", ".", "")

sub(".*\\.|.*", "", s, perl=TRUE)
#[1] "txt"  "zi_"  ""     "ext2" ""     ""     ""     ""    

If you can assume there is always a dot, the simpler pattern below works, but it returns the whole name when there is no extension:

sub(".*\\.", "", s)
#[1] "txt"         "zi_"         "noExtension" "ext2"        ""           
#[6] ""            ""            ""           

For comparison, here are tools::file_ext(s) and the regex it uses internally:

tools::file_ext(s)
#[1] "txt"  ""     ""     "ext2" ""     ""     ""     ""    

pos <- regexpr("\\.([[:alnum:]]+)$", s)
ifelse(pos > -1L, substring(s, pos + 1L), "")
#[1] "txt"  ""     ""     "ext2" ""     ""     ""     ""    