Escape string for Process.Start
How can I escape an unknown string for passing to Process.Start as an argument?
I currently escape basic quotes and backslashes, but recently my input has started to contain things like http://www.fileformat.info/info/unicode/char/ff02/index.htm (Fullwidth quotation mark).
So my question is, what all do I need to escape to safely pass a string as an argument for Process.Start?
Edit: To clarify, what I am really looking for is a list of all characters that have to be escaped in a quoted string ("foo") for cmd.exe. I originally handled the double quote and backslash characters, but I eventually got some input containing a fullwidth quotation mark (as referenced above), which also needed to be escaped. So the question is: what else do I need to escape for a quoted string argument passed to cmd.exe with Process.Start?
This might be useful:
First, multiple arguments are normally separated from one another by spaces. In Figure 2.3, the command has three arguments, c:*.bak, e:\backup, and /s. Occasionally, other characters are used as argument separators. For example, the COPY command can use + characters to separate multiple filenames.
Second, any argument that contains spaces or begins or ends with spaces must be enclosed in double quotes. This is particularly important when using long file and directory names, which frequently contain one or more spaces. If a double-quoted argument itself contains a double quote character, the double quote must be doubled. For example, enter "Quoted" Argument as """Quoted"" Argument".
Third, command switches always begin with a slash / character. A switch is an argument that modifies the operation of the command in some way. Occasionally, switches begin with a + or - character. Some switches are global, and affect the command regardless of their position in the argument list. Other switches are local, and affect specific arguments (such as the one immediately preceding the switch).
Fourth, all reserved shell characters not in double quotes must be escaped. These characters have special meaning to the Windows NT command shell. The reserved shell characters are:
& | ( ) < > ^
To pass reserved shell characters as part of an argument for a command, either the entire argument must be enclosed in double quotes, or the reserved character must be escaped. Prefix a reserved character with a caret (^) character to escape it. For example, the following command will not work as expected, because < and > are reserved shell characters:

    C:\>echo <dir>
    The syntax of the command is incorrect.

Instead, escape the two reserved characters, as follows:

    C:\>echo ^<dir^>
    <dir>
Typically, the reserved shell characters are not used in commands, so collisions that require the use of escapes are rare. They do occur, however. For example, the popular PKZIP program supports a -& switch to enable disk spanning. To use this switch correctly under Windows NT, -^& must be typed.
Tip: The caret character is itself a reserved shell character. Thus, to type a caret character as part of a command argument, type two carets instead. Escaping is necessary only when the normal shell interpretation of reserved characters must be bypassed.
Finally, the maximum allowed length of a shell command appears to be undocumented by Microsoft. Simple testing shows that the Windows NT command shell allows very long commands, in excess of 4,000 characters. Practically speaking, there is no significant upper limit to the length of a command.
http://technet.microsoft.com/en-us/library/cc723564.aspx
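To make the two rules from that excerpt concrete, here is a minimal C# sketch. The class and method names (CmdArgumentSketch, QuoteByDoubling, EscapeReserved) are made up for illustration, and it only covers what the excerpt states: doubling quotes inside a double-quoted argument, and prefixing the listed reserved characters with a caret when the argument is not quoted. It does not try to handle other cmd.exe behavior such as % variable expansion.

```csharp
using System.Text;

public static class CmdArgumentSketch
{
    // Reserved shell characters as listed in the quoted text above.
    private const string ReservedShellChars = "&|()<>^";

    // Wraps the argument in double quotes and doubles any embedded
    // double quote, following the "doubling" rule in the excerpt.
    public static string QuoteByDoubling(string arg)
    {
        return "\"" + arg.Replace("\"", "\"\"") + "\"";
    }

    // For an argument that is NOT enclosed in double quotes:
    // prefixes every reserved shell character with a caret.
    public static string EscapeReserved(string arg)
    {
        var sb = new StringBuilder(arg.Length * 2);
        foreach (var c in arg)
        {
            if (ReservedShellChars.IndexOf(c) >= 0)
            {
                sb.Append('^');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With these helpers, EscapeReserved("<dir>") gives ^<dir^> and EscapeReserved("-&") gives -^&, matching the examples in the excerpt, while QuoteByDoubling applied to "Quoted" Argument gives """Quoted"" Argument".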
This answer is the nearest I've seen to explaining the craziness of Windows command line arguments. It is not as simple as it initially looks.
You shouldn't need to escape U+FF02 in general. The problem is that if you end up passing that character to a command line that doesn't support Unicode, it'll get munged down to its non-compatibility equivalent, the ASCII quote, at which point it becomes dangerous. If your command is going to a tool that doesn't support Unicode, you should fold it down to ASCII before applying the argument escaping, rather than letting the tool at the other end do it.
(Usually, the problem will be when that tool uses the C stdlib to read its arguments. This is defined in terms of 8-bit char; the Windows stdlib implementation uses the default (“ANSI”) system codepage to encode the string to 8-bit, and that codepage is never a Unicode Transformation Format so you'll always lose characters.)
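One way to do the "fold it down before escaping" step in C# is Unicode compatibility normalization (NFKC), which maps U+FF02 to the ASCII quote. This is only a sketch: the helper name FoldCompatibility is made up, and NFKC is an assumption standing in for whatever conversion the receiving tool actually performs. The best-fit mapping of a particular codepage may fold more or fewer characters than NFKC does.

```csharp
using System.Text;

public static class ArgumentFolding
{
    // Hypothetical helper: applies NFKC normalization so that
    // compatibility characters such as U+FF02 (fullwidth quotation mark)
    // become their plain equivalents BEFORE the argument is escaped.
    public static string FoldCompatibility(string arg)
    {
        return arg.Normalize(NormalizationForm.FormKC);
    }
}
```

After folding, the string contains an ordinary ASCII quote where the fullwidth one was, so the normal quote-escaping step sees it and handles it instead of the tool on the other end silently creating an unescaped quote.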
My attempt at escaping:
// Requires: using System.Text;
public static string QuoteArgument(string arg)
{
    // The inverse of http://msdn.microsoft.com/en-us/library/system.environment.getcommandlineargs.aspx
    // Suppose we wish to get after unquoting: \\share\"some folder"\
    // We should provide: "\\share\\\"some folder\"\\"
    var res = new StringBuilder();
    res.Append('"');
    // Backslashes are buffered until we know what follows them.
    var numBackslashes = 0;
    for (var i = 0; i < arg.Length; ++i)
    {
        if (arg[i] == '"')
        {
            // N backslashes before a quote become 2N+1:
            // the doubled run plus one more to escape the quote itself.
            res.Append('\\', 2 * numBackslashes + 1);
            res.Append('"');
            numBackslashes = 0;
        }
        else if (arg[i] == '\\')
        {
            numBackslashes++;
        }
        else
        {
            // Backslashes not followed by a quote are literal; copy them unchanged.
            res.Append('\\', numBackslashes);
            res.Append(arg[i]);
            numBackslashes = 0;
        }
    }
    // Trailing backslashes end up right before the closing quote, so double them.
    // (Counting them here also covers inputs like a\ or \\ that a backwards
    // scan stopping before index 0 would miss.)
    res.Append('\\', 2 * numBackslashes);
    res.Append('"');
    return res.ToString();
}
This is an example of how a quote can be passed as part of a Google search string:
string link = @"https://www.google.com/search?q=\""Hello+World!\""";
Process.Start("CHROME.EXE", link);
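If you are on a runtime that has ProcessStartInfo.ArgumentList (an assumption: it exists in .NET Core 2.1 and later, not in the classic .NET Framework), you can hand over the raw string and let the framework build the command line. A minimal sketch, with a made-up helper name BuildStartInfo; note this covers launching an executable directly, as in the example above, and does not add any cmd.exe-specific escaping:

```csharp
using System.Diagnostics;

public static class LaunchExample
{
    // Hypothetical helper: builds start info with one raw, unescaped argument.
    // The framework quotes and escapes it when the process is started.
    public static ProcessStartInfo BuildStartInfo(string fileName, string rawArgument)
    {
        var psi = new ProcessStartInfo(fileName) { UseShellExecute = false };
        psi.ArgumentList.Add(rawArgument);
        return psi;
    }
}
```

Usage would be something like Process.Start(LaunchExample.BuildStartInfo("CHROME.EXE", "https://www.google.com/search?q=\"Hello+World!\"")), with the quotes written as they should arrive at the target program rather than pre-escaped.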