
Getting parts of a file path

I have a file path, gotten from the __FILE__ macro, and I want to extract 2 pieces from it.

The format is: /some/path/to/a/file/AAA/xxx/BBB.cc. I want the AAA and BBB parts. xxx is generally src, inc, tst, etc., and the file extension is generally .cc, but that is not guaranteed.

I know I can use string.find() or even split the string into an array on the / character, but neither seems efficient, given the number of searches that would be needed. I thought about sscanf and feel that it is probably the best approach; however, I have not been able to define a format that skips most of the beginning and gets the pieces I need. How could I use sscanf to do this, or is there a better way?

Thanks for the help.


Use rfind, so that you can start at the end and work backwards:

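std::string s = "/some/path/to/a/file/AAA/xxx/BBB.cc";

// rfind returns std::string::npos when nothing is found, so use size_type
// (not unsigned int) and check for npos if the format is not guaranteed.
std::string::size_type a = s.rfind('.');   // dot before the extension
std::string::size_type b = s.rfind('/');   // slash before BBB
std::string BBB = s.substr(b+1,a-b-1);

a = s.rfind('/',b-1);                      // slash before xxx
b = s.rfind('/',a-1);                      // slash before AAA
std::string AAA = s.substr(b+1,a-b-1);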


  1. Get it right
  2. If it's not fast enough, improve it

It's easier to just write this yourself than to try to get sscanf to do it. Your code will be easier to understand and quite a bit faster (though I doubt that will matter).

Just loop from the back of the string. When you find the first dot, remember that location, then extract BBB when you find the first slash. Remember where the second slash is, and extract AAA when you find the third one.
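The backward scan described above could look roughly like the following. This is only a sketch: the names PathParts and split_path are made up for illustration, and it assumes the path has at least three slashes (so a relative path such as AAA/xxx/BBB.cc is reported as a failure).

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not part of the original answer.
struct PathParts {
    std::string aaa;   // directory two levels above the file
    std::string bbb;   // file name without its last extension
    bool ok = false;   // true only if three slashes were found
};

// Scans the path once from the end. The first '.' seen (before any '/')
// marks the end of BBB, the first '/' marks its start, the second '/' marks
// the end of AAA, and the third '/' marks the start of AAA.
PathParts split_path(const std::string& path) {
    PathParts out;
    const std::size_t npos = std::string::npos;
    std::size_t dot = npos;           // position of the extension dot, if any
    std::size_t second_slash = npos;  // slash between AAA and xxx
    int slashes = 0;                  // number of slashes seen so far

    for (std::size_t i = path.size(); i-- > 0; ) {
        const char c = path[i];
        if (c == '.' && slashes == 0 && dot == npos) {
            dot = i;
        } else if (c == '/') {
            if (slashes == 0) {
                // No extension means BBB runs to the end of the string.
                const std::size_t end = (dot == npos) ? path.size() : dot;
                out.bbb = path.substr(i + 1, end - i - 1);
            } else if (slashes == 1) {
                second_slash = i;
            } else {
                out.aaa = path.substr(i + 1, second_slash - i - 1);
                out.ok = true;
                return out;
            }
            ++slashes;
        }
    }
    return out;  // fewer than three slashes: ok stays false
}
```

For the path in the question, split_path returns aaa == "AAA" and bbb == "BBB". A dot inside a directory name is ignored because dots are only considered before the first slash is reached.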


A regular expression can do the trick:

#include <boost/regex.hpp>
#include <iostream>
#include <string>
#include <cstdlib>

int main() {
    std::string path("/some/path/to/a/file/AAA/xxx/BBB.cc");

    boost::regex path_re(".+/([^/]+)/[^/]+/([^.]+)\\.(.+?)", boost::regex::perl);
    boost::smatch m;
    if(regex_match(path, m, path_re)) {
        std::cout << "part 1 " << m[1] << '\n';
        std::cout << "part 2 " << m[2] << '\n';
        std::cout << "part 3 " << m[3] << '\n';
    }
    else {
        abort();
    }
}

Output:

part 1 AAA
part 2 BBB
part 3 cc

Note that it doesn't handle non-canonical paths that contain /./ elements.
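One possible workaround, which is not part of the regex answer itself, is to strip the /./ elements before matching. The helper below is a hypothetical sketch using only std::string; it deliberately ignores other non-canonical forms such as "..", doubled slashes, or symlinks.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of the path with every "/./"
// collapsed to "/". It does not resolve "..", "//", or symlinks.
std::string drop_dot_segments(std::string path) {
    const std::string dot_segment = "/./";
    std::size_t pos = 0;
    while ((pos = path.find(dot_segment, pos)) != std::string::npos) {
        // Erase "/." and keep the trailing '/', so "a/./b" becomes "a/b".
        // Searching again from the same position also catches "/././".
        path.erase(pos, 2);
    }
    return path;
}
```

The cleaned string can then be passed to regex_match in place of the raw path.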


char *path = ... /* fill this however you like, for example function argument */
char *AAA_start, *AAA_end;
char *BBB_start, *BBB_end;
        // go to the end of the string and scan back to the last .
for (BBB_end = path+strlen(path); *BBB_end != '.'; --BBB_end);
        // continue back to the / in front of BBB
for (BBB_start = BBB_end; *BBB_start != '/'; --BBB_start);
        // BBB now lies strictly between BBB_start (the /) and BBB_end (the .)
        // continue from there to the / in front of xxx
for (AAA_end = BBB_start-1; *AAA_end != '/'; --AAA_end);
        // continue from there to the / in front of AAA
for (AAA_start = AAA_end-1; *AAA_start != '/'; --AAA_start);
        // AAA now lies strictly between AAA_start and AAA_end (both /)

        // Now you can do whatever you want with AAA and BBB, for example
char *AAA = new char[AAA_end-AAA_start];    // end-start-1 characters between the
                                            // delimiters, +1 for the terminating '\0'
char *BBB = new char[BBB_end-BBB_start];
memcpy(AAA, AAA_start+1, AAA_end-AAA_start-1);  // skip the leading delimiter
memcpy(BBB, BBB_start+1, BBB_end-BBB_start-1);
AAA[AAA_end-AAA_start-1] = '\0';
BBB[BBB_end-BBB_start-1] = '\0';

That was the basic idea. Now you need to add error checking to this:

char *path = ... /* fill this however you like, for example function argument */
char *AAA_start, *AAA_end;
char *BBB_start, *BBB_end;
        // each loop also stops at the start of the string, which means the
        // expected delimiter was not found
for (BBB_end = path+strlen(path); *BBB_end != '.' && BBB_end != path; --BBB_end);
if (BBB_end == path) return FAIL;
for (BBB_start = BBB_end; *BBB_start != '/' && BBB_start != path; --BBB_start);
if (BBB_start == path) return FAIL;
for (AAA_end = BBB_start-1; *AAA_end != '/' && AAA_end != path; --AAA_end);
if (AAA_end == path) return FAIL;
for (AAA_start = AAA_end-1; *AAA_start != '/' && AAA_start != path; --AAA_start);
if (AAA_start == path && *AAA_start != '/') return FAIL;

        // copy only the characters between the delimiters
char *AAA = new char[AAA_end-AAA_start];
char *BBB = new char[BBB_end-BBB_start];
memcpy(AAA, AAA_start+1, AAA_end-AAA_start-1);
memcpy(BBB, BBB_start+1, BBB_end-BBB_start-1);
AAA[AAA_end-AAA_start-1] = '\0';
BBB[BBB_end-BBB_start-1] = '\0';