Regular expression to remove a file's extension
I am in need of 开发者_JAVA技巧a regular expression that can remove the extension of a filename, returning only the name of the file.
Here are some examples of inputs and outputs:
myfile.png -> myfile
myfile.png.jpg -> myfile.png
I can obviously do this manually (ie removing everything from the last dot) but I'm sure that there is a regular expression that can do this by itself.
Just for the record, I am doing this in JavaScript
Just for completeness: How could this be achieved without Regular Expressions?
var input = 'myfile.png';
var output = input.substr(0, input.lastIndexOf('.')) || input;
The || input
takes care of the case, where lastIndexOf()
provides a -1
. You see, it's still a one-liner.
/(.*)\.[^.]+$/
Result will be in that first capture group. However, it's probably more efficient to just find the position of the rightmost period and then take everything before it, without using regex.
The regular expression to match the pattern is:
/\.[^.]*$/
It finds a period character (\.), followed by 0 or more characters that are not periods ([^.]*), followed by the end of the string ($).
console.log(
"aaa.bbb.ccc".replace(/\.[^.]*$/,'')
)
/^(.+)(\.[^ .]+)?$/
Test cases where this works and others fail:
- ".htaccess" (leading period)
- "file" (no file extension)
- "send to mrs." (no extension, but ends in abbr.)
- "version 1.2 of project" (no extension, yet still contains a period)
The common thread above is, of course, "malformed" file extensions. But you always have to think about those corner cases. :P
Test cases where this fails:
- "version 1.2" (no file extension, but "appears" to have one)
- "name.tar.gz" (if you view this as a "compound extension" and wanted it split into "name" and ".tar.gz")
How to handle these is problematic and best decided on a project-specific basis.
/^(.+)(\.[^ .]+)?$/
Above pattern is wrong - it will always include the extension too. It's because of how the javascript regex engine works. The (\.[^ .]+)
token is optional so the engine will successfully match the entire string with (.+)
http://cl.ly/image/3G1I3h3M2Q0M
Here's my tested regexp solution.
The pattern will match filenameNoExt with/without extension in the path, respecting both slash and backslash separators
var path = "c:\some.path/subfolder/file.ext"
var m = path.match(/([^:\\/]*?)(?:\.([^ :\\/.]*))?$/)
var fileName = (m === null)? "" : m[0]
var fileExt = (m === null)? "" : m[1]
dissection of the above pattern:
([^:\\/]*?) // match any character, except slashes and colon, 0-or-more times,
// make the token non-greedy so that the regex engine
// will try to match the next token (the file extension)
// capture the file name token to subpattern \1
(?:\. // match the '.' but don't capture it
([^ :\\/.]*) // match file extension
// ensure that the last element of the path is matched by prohibiting slashes
// capture the file extension token to subpattern \2
)?$ // the whole file extension is optional
http://cl.ly/image/3t3N413g3K09
http://www.gethifi.com/tools/regex
This will cover all cases that was mentioned by @RogerPate but including full paths too
another no-regex way of doing it (the "oposite" of @Rahul's version, not using pop() to remove)
It doesn't require to refer to the variable twice, so it's easier to inline
filename.split('.').slice(0,-1).join()
This will do it as well :)
'myfile.png.jpg'.split('.').reverse().slice(1).reverse().join('.');
I'd stick to the regexp though... =P
return filename.split('.').pop();
it will make your wish come true. But not regular expression way.
In javascript you can call the Replace() method that will replace based on a regular expression.
This regular expression will match everything from the begining of the line to the end and remove anything after the last period including the period.
/^(.*)\..*$/
The how of implementing the replace can be found in this Stackoverflow question.
Javascript regex question
精彩评论