Extract file extension from file path
How can I extract the extension of a file given a file path as a character? I know I can do this via regular expression regexpr("\\.([[:alnum:]]+)$", x), but wondering if there's a built-in function to deal with this?
This is the sort of thing that is easily found with basic R tools, e.g. ??path.
Anyway, load the tools package and read ?file_ext.
Let me extend the great answer from https://stackoverflow.com/users/680068/zx8754 a little.
Here is a simple code snippet:
# 1. Load library 'tools'
library("tools")
# 2. Get extension for file 'test.txt'
file_ext("test.txt")
The result should be 'txt'.
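Since tools::file_ext() is vectorised, the same call also works on a whole vector of paths. A minimal sketch, assuming some made-up paths (the file names below are only for illustration):

```r
library(tools)

# hypothetical example paths, with and without directories
paths <- c("/home/user/report.pdf", "data/archive.tar.gz", "README")

# one extension per input; "" where no extension is found
exts <- file_ext(paths)
print(exts)
# [1] "pdf" "gz"  ""
```

Note that only the last extension is returned, so "archive.tar.gz" gives "gz" rather than "tar.gz".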
A simple function with no package to load:
getExtension <- function(file){
  ex <- strsplit(basename(file), split="\\.")[[1]]
  # the extension is the last piece; "" if the name has no dot
  if (length(ex) > 1) ex[length(ex)] else ""
}
The regexpr above fails if the extension contains non-alphanumeric characters (see e.g. https://en.wikipedia.org/wiki/List_of_filename_extensions). As an alternative, one may use the following function:
getFileNameExtension <- function (fn) {
  # remove the path
  splitted <- strsplit(x=fn, split='/')[[1]]
  # or use .Platform$file.sep instead of '/'
  fn <- splitted[length(splitted)]
  ext <- ''
  splitted <- strsplit(x=fn, split='\\.')[[1]]
  l <- length(splitted)
  # the extension must be the suffix of a non-empty name
  if (l > 1 && any(splitted[1:(l-1)] != '')) ext <- splitted[l]
  ext
}
Extract the file extension without the dot:
tools::file_ext(fileName)
Extract the file extension with the dot:
paste0(".", tools::file_ext(fileName))
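One thing to watch with the paste0() form: for a file without an extension, tools::file_ext() returns "", so the result is a lone ".". A minimal sketch of a guard, where ext_with_dot is a hypothetical helper name and not part of any package:

```r
# hypothetical helper: prepend the dot only when an extension was found
ext_with_dot <- function(x) {
  ext <- tools::file_ext(x)
  ifelse(nzchar(ext), paste0(".", ext), "")
}

res <- ext_with_dot(c("test.txt", "README"))
print(res)
# [1] ".txt" ""
```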
If you don't want to use any additional package, you could try
file_extension <- function(filenames) {
  sub(pattern = "^(.*\\.|[^.]+)(?=[^.]*)", replacement = "", filenames, perl = TRUE)
}
If you like to be cryptic, you could use it as a one-line expression: sub("^(.*\\.|[^.]+)(?=[^.]*)", "", filenames, perl = TRUE) ;-)
It works for zero (!), one or more file names (as a character vector or list) with an arbitrary number of dots, and also for file names without any extension, where it returns the empty string "".
Here are the tests I tried:
> file_extension("simple.txt")
[1] "txt"
> file_extension(c("no extension", "simple.ext1", "with.two.ext2", "some.awkward.file.name.with.a.final.dot.", "..", ".", ""))
[1] "" "ext1" "ext2" "" "" "" ""
> file_extension(list("file.ext1", "one.more.file.ext2"))
[1] "ext1" "ext2"
> file_extension(NULL)
character(0)
> file_extension(c())
character(0)
> file_extension(list())
character(0)
By the way, tools::file_ext() has trouble finding "strange" extensions with non-alphanumeric characters:
> tools::file_ext("file.zi_")
[1] ""
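The empty result comes from the [[:alnum:]] character class in the regex quoted in the question, which does not match the underscore. A minimal sketch of a variant that accepts any character except a dot in the extension; file_ext_any is a hypothetical name, and the use of basename() to drop directories first is my own assumption, not something tools::file_ext() does:

```r
# hypothetical variant: same structure as the regex from the question,
# but the extension may contain any character except a dot
file_ext_any <- function(x) {
  x <- basename(x)                    # ignore dots in directory names
  pos <- regexpr("\\.([^.]+)$", x)    # position of the last dot, or -1
  ifelse(pos > -1L, substring(x, pos + 1L), "")
}

res <- file_ext_any(c("file.zi_", "test.txt", "noExtension", "dir.d/README"))
print(res)
```

With these inputs it should give "zi_" and "txt" for the first two names and "" for the last two. The trade-off is that it is more permissive, so something like "notes.final draft" would report "final draft" as the extension.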
This function uses pipes:
library(magrittr)
file_ext <- function(f_name) {
  f_name %>%
    strsplit(".", fixed = TRUE) %>%
    unlist %>%
    tail(1)  # last piece, so "a.b.txt" gives "txt"; assumes one file name that contains a dot
}
file_ext("test.txt")
# [1] "txt"
Simplest way I've found with no additional packages:
FileExt <- function(filename) {
  nameSplit <- strsplit(x = filename, split = "\\.")[[1]]
  return(nameSplit[length(nameSplit)])
}
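Note that FileExt() only looks at the first element of its input (because of the [[1]]) and returns the whole name when there is no dot. A minimal sketch of a vectorised variant that returns "" in that case; get_extensions is a hypothetical name, and dropping directories with basename() is an added assumption:

```r
# hypothetical vectorised variant of the strsplit approach
get_extensions <- function(files) {
  parts <- strsplit(basename(files), ".", fixed = TRUE)
  # last piece of each name, or "" when the name contains no dot
  vapply(parts,
         function(p) if (length(p) > 1) p[length(p)] else "",
         character(1))
}

res <- get_extensions(c("a/b/test.txt", "archive.tar.gz", "README"))
print(res)
# [1] "txt" "gz"  ""
```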
One way would be to use sub.
s <- c("test.txt", "file.zi_", "noExtension", "with.two.ext2",
"file.with.final.dot.", "..", ".", "")
sub(".*\\.|.*", "", s, perl=TRUE)
#[1] "txt" "zi_" "" "ext2" "" "" "" ""
If you assume there is always a dot, the pattern gets simpler, but it fails when there is no extension (the whole name is returned):
sub(".*\\.", "", s)
#[1] "txt" "zi_" "noExtension" "ext2" ""
#[6] "" "" ""
For comparison, here is tools::file_ext(s) and the regex-based code it uses internally:
tools::file_ext(s)
#[1] "txt" "" "" "ext2" "" "" "" ""
pos <- regexpr("\\.([[:alnum:]]+)$", s)
ifelse(pos > -1L, substring(s, pos + 1L), "")
#[1] "txt" "" "" "ext2" "" "" "" ""