Regex to check if a path only goes down
I want to test whether a path given by the user only goes down, like:
my/down/path
as opposed to:
this/path/../../go/up
for security reasons.
I have already written this:
return (bool)preg_match('#^[a-z0-9_-]+(/[a-z0-9_-]+)*$#i', $fieldValue);
But the user should be allowed to use '.' in the path (like my/./path; that's not useful, but it should be permitted), and I don't know how to account for it.
So I'm looking for a secure regex to check this.
Thanks
Edit: After viewing the answers: yes, it would be fine if the test checked whether the real path (after resolving '.' and '..') is a downward path.
You can simply check if the realpath of the user supplied path begins with the allowed path:
function isBelowAllowedPath($allowedPath, $pathToCheck)
{
    return strpos(
        realpath($allowedPath . DIRECTORY_SEPARATOR . $pathToCheck),
        realpath($allowedPath)
    ) === 0;
}
Note that this will also return false for directories that do not exist below $allowedPath.
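One caveat with a plain prefix comparison is that a sibling directory whose name merely starts with the allowed directory's name (for example uploads_private next to uploads) would also pass. A minimal sketch of a stricter variant follows; the helper name isInsideAllowedPath is hypothetical, and the only change is that the comparison is made against the base path plus a trailing separator:

```php
<?php
// Hypothetical helper: a stricter variant of the prefix check above.
function isInsideAllowedPath(string $allowedPath, string $pathToCheck): bool
{
    $base   = realpath($allowedPath);
    $target = realpath($allowedPath . DIRECTORY_SEPARATOR . $pathToCheck);

    // realpath() returns false when the path does not exist.
    if ($base === false || $target === false) {
        return false;
    }

    // The allowed directory itself is accepted.
    if ($target === $base) {
        return true;
    }

    // Compare against "base + separator" so that "/srv/uploads_private"
    // is not mistaken for something inside "/srv/uploads".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;

    return strncmp($target, $prefix, strlen($prefix)) === 0;
}
```

Like the original, this still depends on realpath, so it only accepts paths that already exist.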
You probably do not want to check that a path doesn't contain .. but instead want to check that, evaluated as a whole, it doesn't go up. E.g. ./path/.. is still in . even though it contains ..
You can find an implementation of path depth validation in Twig:
$parts = preg_split('#[\\\\/]+#', $name);
$level = 0;
foreach ($parts as $part) {
    if ('..' === $part) {
        --$level;
    } elseif ('.' !== $part) {
        ++$level;
    }
    if ($level < 0) {
        return false;
    }
}
return true;
Twig does not use realpath for the validation, because realpath has issues with paths in Phar archives. Additionally, realpath only works if the pathname already exists.
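To try the depth check on a few inputs, the snippet can be wrapped in a function. This is only a sketch: the function name staysAtOrBelowStart is hypothetical, and the loop is the same as the one quoted above.

```php
<?php
// Hypothetical wrapper around the depth-counting loop quoted above.
function staysAtOrBelowStart(string $name): bool
{
    // Split on runs of forward slashes or backslashes.
    $parts = preg_split('#[\\\\/]+#', $name);
    $level = 0;
    foreach ($parts as $part) {
        if ('..' === $part) {
            --$level;          // one level up
        } elseif ('.' !== $part) {
            ++$level;          // one level down ('.' leaves the level unchanged)
        }
        if ($level < 0) {
            return false;      // climbed above the starting directory
        }
    }
    return true;
}

var_dump(staysAtOrBelowStart('my/down/path')); // bool(true)
var_dump(staysAtOrBelowStart('my/./path'));    // bool(true)
var_dump(staysAtOrBelowStart('./path/..'));    // bool(true)
var_dump(staysAtOrBelowStart('a/../../etc'));  // bool(false)
var_dump(staysAtOrBelowStart('../up'));        // bool(false)
```

Note that this counts depth only. An absolute path such as /etc/passwd never drops below level 0, so it passes and would need to be rejected separately.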
The previous responses (including the accepted one) address path depth but not path traversal. Since the question specifically mentions that this is for security, casual checking such as that described so far may not be sufficient.
For example,
- Do you care about traversing through hard or soft links below the current working directory?
- Does the system you are on (or could potentially deploy to) support Unicode?
- How many things are evaluating the string in question before or after your PHP code sees it? The web server? The shell? Something else?
Suppose I send your script a string like ./..%2f../ ? Is it important to your application that this string will take me up two levels? Or that the scripts provided in other answers will not catch this because it doesn't evaluate to .. ?
What about ./\.\./ ? If the path is parsed by splitting on both \ and /, the script in the accepted answer won't catch it, because each part will look like . which is simply the current directory. But a typical UNIX shell treats the \ as an escape character, so passing it ./\.\./ is equivalent to ./../ and the attacker can exploit the fact that the script combines tests for UNIX and Windows style paths.
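The two strings above can be run through a split-and-count check to see the effect. The sketch below assumes a hypothetical helper depthCheck that repeats the depth-counting loop from the earlier answer, and uses rawurldecode to stand in for whatever layer decodes the input before it reaches the filesystem:

```php
<?php
// Hypothetical helper repeating the split-and-count depth check.
function depthCheck(string $name): bool
{
    $level = 0;
    foreach (preg_split('#[\\\\/]+#', $name) as $part) {
        if ('..' === $part) {
            --$level;
        } elseif ('.' !== $part) {
            ++$level;
        }
        if ($level < 0) {
            return false;
        }
    }
    return true;
}

$encoded = './..%2f../';

// "..%2f.." is treated as an ordinary directory name, so the check passes.
var_dump(depthCheck($encoded));               // bool(true)

// Once decoded, the same input is "./../../" and is rejected.
var_dump(depthCheck(rawurldecode($encoded))); // bool(false)

// Splitting on backslashes turns "./\.\./" into a series of "." parts.
var_dump(depthCheck('./\\.\\./'));            // bool(true)
```

Decoding once before validating only narrows the gap. It does not cover double-encoded input or interpretation by other layers such as a shell.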
If by "security" you really mean you want to provide protection against casual mistakes and typos, then the other answers are probably sufficient. If you are programming for a hostile environment and want to prevent breaches from deliberate attacks then they barely scratch the surface and you would be well advised to go read up on secure programming at OWASP. I would start with their articles on Path Traversal and then read up on the other attacks they outline as well as how to avoid them and, more importantly, how to test for them.
$folders = explode('/', $path);
if (in_array('..', $folders)) {
    print('Error: path contains ..');
}
If you only want to restrict the user from going up in the path hierarchy, you can explicitly search for '..':
if (1 === preg_match('/\.\./', $path)) {
    /* path contains .. */
}
This is also quicker than explode and in_array.
Benchmarking:
<?php
$attempts = 100000;
$path = 'my/path/with/../invalid';

$t = microtime(true);
for ($i = 0; $i < $attempts; ++$i) {
    $folders = explode('/', $path);
    if (in_array('..', $folders)) {
        /* .. in path */ ;
    }
}
$end = microtime(true);
printf("in_array: %f\n", $end - $t);

$t = microtime(true);
for ($i = 0; $i < $attempts; ++$i) {
    if (1 === preg_match('/\.\./', $path)) {
        /* .. in path */ ;
    }
}
$end = microtime(true);
printf("preg_match: %f\n", $end - $t);
in_array: 0.088750
preg_match: 0.071547
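One difference between the two tests above: the pattern /\.\./ matches any two consecutive dots, so a name such as notes..txt is also flagged, whereas explode with in_array only flags a whole .. component. If that matters, the regex can be anchored to component boundaries. A minimal sketch, with a hypothetical function name and assuming both / and \ should count as separators:

```php
<?php
// Hypothetical helper: true only when ".." appears as a whole path component.
function hasParentSegment(string $path): bool
{
    // ".." must be preceded by the start of the string or a separator,
    // and followed by a separator or the end of the string.
    return 1 === preg_match('#(?:^|[\\\\/])\.\.(?:[\\\\/]|$)#', $path);
}

var_dump(hasParentSegment('my/path/with/../invalid')); // bool(true)
var_dump(hasParentSegment('my/notes..txt'));           // bool(false)
var_dump(hasParentSegment('my/./path'));               // bool(false)
```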