Create a safe, escaped path base/file name, check if safe

I wonder if there is a generic, portable way to produce filesystem-safe filenames. That is, I have a user-entered string and would like to produce a file whose name resembles the name they have chosen as closely as possible. The resulting name must not include any path reference or any other name or tag that is special to the file system.

Currently I just replace a bunch of known bad characters with other characters, or empty strings. For example, given the name ABC / DEF* : A Company? I'd produce the string ABC - DEF - A Company. My choice for replacement characters is totally arbitrary as I don't know of a generic escape symbol.

So my related questions are:

  1. Is there a method (perhaps in boost filesystem) that can tell me if the name refers strictly to a file without a path?
  2. Is there a function that tells me if the name is "safe" to use as a file (this may be an additional check from 1 for some filesystems)?
  3. Is there a function to convert a string into a reasonable safe name?

Additional Notes

For #1 I thought to just compare a boost path::filename() to the original object; if they are the same, then I have a file. However, this still allows things like '..' and '.', but that might be okay if there is a good solution for #2.
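The check described for #1 could be sketched roughly as follows. This is only a sketch under assumptions: it uses C++17 std::filesystem in place of Boost (Boost's filename() may treat trailing separators differently), the function name is_bare_filename is hypothetical, and '.' and '..' are rejected explicitly because filename() alone lets them through.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: returns true if `name` is a single path component
// with no directory part, and is not one of the special entries "." or "..".
bool is_bare_filename(const std::string& name) {
    namespace fs = std::filesystem;
    if (name.empty() || name == "." || name == "..") {
        return false;
    }
    const fs::path p(name);
    // filename() yields the last component of the path. If it equals the
    // whole path, the input had no directory, root or trailing separator.
    return p.filename() == p;
}
```

Note that this only answers whether the string contains a path. It says nothing about whether the remaining characters are acceptable to a particular filesystem, so it does not replace a check for #2.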

In theory I'd have to provide a directory in which the file would reside, since different file-systems may have different requirements. But a global solution for the OS would also be okay.

I already have a function that just replaces a bunch of commonly known unsafe characters.

Common file dialogs cannot be used to do the filtering since the interface may not always allow them and in some cases the user isn't directly aware of the relationship to the file (advanced users would however).


According to the POSIX definition of fully portable filenames, the only portable filenames are those that contain only the characters A-Z, a-z, 0-9, '.', '_' and '-', and are at most 14 characters long.

That said, a more practical approach is to assume that modern filesystems can cope with longer filenames and to simply replace all characters which are not explicitly marked as "safe" with _. Sometimes, instead of replacing with _, those characters are hex-encoded, like in URLs: sample%20file.txt. KDE applications use this, for example.
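The hex-encoding variant might look something like the sketch below. The function name percent_encode_filename is hypothetical, and the safe set is assumed to be the POSIX portable characters; this is not taken from KDE's code. Because '%' itself is outside the safe set, it is encoded as %25, which keeps the mapping reversible.

```cpp
#include <string>

// Hypothetical helper: percent-encodes every byte that is not in the
// assumed safe set [A-Za-z0-9._-], in the style of URL encoding.
std::string percent_encode_filename(const std::string& input) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(input.size());
    for (unsigned char c : input) {
        const bool safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '.' || c == '_' || c == '-';
        if (safe) {
            out += static_cast<char>(c);
        } else {
            // Emit '%' followed by the two hex digits of the byte value.
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

With this sketch, "sample file.txt" becomes "sample%20file.txt". The names "." and ".." pass through unchanged, so they would still need a separate check.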

As for implementation, it's as simple as s/[^A-Za-z0-9._-]/_/g.
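In C++, the same substitution applied to every match could be sketched with std::regex. The function name sanitize_filename is hypothetical, and the character class assumes the POSIX portable set including '_'. The replacement works byte by byte, so a multi-byte UTF-8 character would turn into several underscores.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every character outside the assumed safe
// set [A-Za-z0-9._-] with an underscore.
std::string sanitize_filename(const std::string& input) {
    static const std::regex unsafe("[^A-Za-z0-9._-]");
    // std::regex_replace substitutes all matches by default.
    return std::regex_replace(input, unsafe, "_");
}
```

For the example from the question, sanitize_filename("ABC / DEF* : A Company?") gives "ABC___DEF____A_Company_". Collapsing runs of underscores or trimming the trailing one would be an extra step if a tidier name is wanted.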


How portable is portable? Many systems had limits on length, and some probably still do. Is distinguishing between names an issue? Some systems distinguish case, and others don't. What about a final .xxx? For some systems it is significant; for others, it's just text.

Neglecting length, the safest bet is to take the opposite approach: create a set of known safe characters, and convert everything outside of that to a specific character. ASCII alphanumerics, and '_' seem pretty safe, and you're probably OK (today) with '-', but I doubt the list goes much further. And depending on what you're doing with these names, you might want to force them to a single case, either upper or lower.
