
Sort C++ Strings with multiple criteria

I need to sort a C++ std::vector<std::string> fileNames. The file names follow this pattern:

YYDDDTTTT_Z_SITE

YY = Year (e.g. 2009 = 09, 2010 = 10)

DDD = Day of the year (e.g. 1 January = 001, 31 December = 365)

TTTT = Time of day (e.g. midnight = 0000, noon = 1200)

ZONE (Z in the pattern) = Either E or W

SITE = Four-letter site name (e.g. HILL, SAMM)

I need the strings to be sorted in the following order: ZONE, SITE, YY, DDD, TTTT


Use std::sort with a comparison function.

(The link has a nice example)


The easy part: write the sort itself:

// Return true if the first arg is strictly less than the second
bool compareFilenames(const std::string& lhs, const std::string& rhs);
...
std::sort(fileNames.begin(), fileNames.end(), &compareFilenames);

The harder part: writing the comparison itself. In pseudocode, for full generality:

bool compareFilenames(const std::string& lhs, const std::string& rhs)
{
    parse the filenames
    if (lhs zone != rhs zone)
        return lhs zone < rhs zone
    if (lhs site != rhs site)
        return lhs site < rhs site
    ...
    return false
}

where lhs site, etc. are the individual bits of data you need to sort by, picked out of the filename.
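
As a rough illustration, the pseudocode above could be filled in along these lines. This is only a sketch: FileKey, parseFilename and sortFilenames are hypothetical names, C++11 is assumed for std::tie, and every name is assumed to be exactly 16 characters in the YYDDDTTTT_Z_SITE layout (substr throws std::out_of_range on names that are too short).

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical helper type: the fields of one file name.
struct FileKey {
    std::string zone, site, year, day, time;
};

// Assumes the fixed-width layout "YYDDDTTTT_Z_SITE".
FileKey parseFilename(const std::string& name)
{
    return FileKey{
        name.substr(10, 1),  // Z
        name.substr(12, 4),  // SITE
        name.substr(0, 2),   // YY
        name.substr(2, 3),   // DDD
        name.substr(5, 4)    // TTTT
    };
}

// Return true if lhs is strictly less than rhs.
bool compareFilenames(const std::string& lhs, const std::string& rhs)
{
    const FileKey a = parseFilename(lhs);
    const FileKey b = parseFilename(rhs);
    // std::tie builds tuples of references; tuple operator< compares
    // field by field, in the order listed, and is false for equal keys.
    return std::tie(a.zone, a.site, a.year, a.day, a.time)
         < std::tie(b.zone, b.site, b.year, b.day, b.time);
}

void sortFilenames(std::vector<std::string>& fileNames)
{
    std::sort(fileNames.begin(), fileNames.end(), compareFilenames);
}
```

The fields are compared as strings, which works here because each one is zero-padded to a fixed width. Note that a two-digit year compared this way would place 99 after 00.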

Given the strict file naming structure you have, though, and your specific sorting needs, you can actually get away with just splitting the string at the first '_' character and doing a lexicographical compare of the second chunk (Z_SITE), followed by the first chunk (YYDDDTTTT) if the second chunks are equal. That makes the code to parse the filename much simpler, at the potential cost of flexibility if the file naming format ever changes.
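
A minimal sketch of that shortcut, assuming every name contains at least one '_' and that the part before it is always the same width. The function name compareBySuffixThenPrefix is made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Compare the text after the first '_' ("Z_SITE") first, and fall back
// to the text before it ("YYDDDTTTT") only when the suffixes are equal.
bool compareBySuffixThenPrefix(const std::string& lhs, const std::string& rhs)
{
    const std::string::size_type lsplit = lhs.find('_');
    const std::string::size_type rsplit = rhs.find('_');

    // compare(pos1, len1, other, pos2, len2) compares substrings in place,
    // without creating temporary strings.
    const int suffix = lhs.compare(lsplit + 1, std::string::npos,
                                   rhs, rsplit + 1, std::string::npos);
    if (suffix != 0)
        return suffix < 0;

    return lhs.compare(0, lsplit, rhs, 0, rsplit) < 0;
}
```

It would be used as std::sort(fileNames.begin(), fileNames.end(), compareBySuffixThenPrefix). The prefix comparison gives year, then day, then time only because those fields are zero-padded and appear in that order.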


Just write a function that compares two filenames according to your criteria to determine which one comes first, then use any standard sorting method.


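Use std::sort and implement a Compare class.

See http://www.cplusplus.com/reference/stl/list/sort/ for further details. That page documents list::sort; for a std::vector, pass the same kind of comparison object to std::sort from <algorithm>.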
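
One possible shape for such a comparison class is sketched below. FilenameLess is a hypothetical name, the offsets assume the fixed-width YYDDDTTTT_Z_SITE layout from the question, and C++11 is assumed for the range-based for loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical comparison class for names shaped like "YYDDDTTTT_Z_SITE".
struct FilenameLess {
    bool operator()(const std::string& lhs, const std::string& rhs) const
    {
        // {offset, length} of each field, highest priority first:
        // Z, then SITE, then YYDDDTTTT treated as one block.
        static const std::size_t fields[][2] = { {10, 1}, {12, 4}, {0, 9} };

        for (const auto& f : fields) {
            // Compares the same substring of both names in place;
            // throws std::out_of_range if an offset is past the end.
            const int c = lhs.compare(f[0], f[1], rhs, f[0], f[1]);
            if (c != 0)
                return c < 0;
        }
        return false;  // equal names are not "less than" each other
    }
};
```

A call would then look like std::sort(fileNames.begin(), fileNames.end(), FilenameLess()). Keeping the field table inside the class means a change to the priority order only touches one line.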


The sort predicate that you pass to std::sort() (std::vector has no sort() member) can build a reordered temporary copy of each string, with the zone and site moved in front of the date and time, and then compare those copies.
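
A small sketch of that idea, assuming the fixed-width YYDDDTTTT_Z_SITE layout. The names sortKey and compareByKey are invented for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: rearranges "YYDDDTTTT_Z_SITE" into
// "Z" + "SITE" + "YYDDDTTTT", so that plain string comparison
// follows the required priority order.
std::string sortKey(const std::string& name)
{
    return name.substr(10, 1) + name.substr(12, 4) + name.substr(0, 9);
}

// Return true if lhs sorts strictly before rhs.
bool compareByKey(const std::string& lhs, const std::string& rhs)
{
    return sortKey(lhs) < sortKey(rhs);
}
```

This is easy to read, but it allocates temporary strings on every comparison, so it may be noticeably slower on large vectors than comparing the fields in place.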


You could use qsort with your own string compare function that takes your sorting rules into account, passing the address of the vector's first element where it asks for an array.
http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/

But you shouldn't: qsort moves elements around as raw bytes, which is not valid for non-trivial types such as std::string. Just use std::sort.


Here's a Boost.Lambda version. This is overkill and pretty cryptic, but it's brief, and it's flexible in how the criteria for the different fields can be combined. Obviously you need Boost, and you should expect increased compilation time. So, here it is:

#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
#include "boost/lambda/detail/operator_actions.hpp"
#include "boost/lambda/detail/operator_return_type_traits.hpp"
#include "boost/lambda/detail/control_structures_impl.hpp"
#include "boost/ref.hpp"
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <algorithm>
#include <cassert>

using namespace std;
using namespace boost::lambda;

//helpers: a better way would be to group them
//under a flyweight, or something...
string extract_year(string str_)
{
    return str_.substr(0,2);
}

string extract_dayofyear(string str_)
{
    return str_.substr(2,3);
}

string extract_timeofday(string str_)
{
    return str_.substr(5,4);
}

string extract_zone(string str_)
{
    return str_.substr(10,1);
}

string extract_site(string str_)
{
    return str_.substr(12,4);
}

//Uhm, just for brevity... ('cause otherwise we should stay away from macros ;-)
#define IF_THEN_ELSE_RET(op1,op2,exp) if_then_else_return(var(op1)<var(op2),true,if_then_else_return(var(op1)>var(op2),false,exp))

void sort_fnames(vector<string>& fnames)
{
    string z1,z2,s1,s2,y1,y2,d1,d2,t1,t2;

    //sort by zone-then-site-then-year-then-day-then-time:
    //Note the format of the sort(fnames.begin(),fnames.end(), (,...,boolean_expression) );
    //remember, in a sequence of comma-delimited statements enclosed between parens, like
    //val=(.,...,boolean_expression); only the last expression, boolean_expression, gets
    //assigned to variable "val";
    //So, in the sort() call below, the statements 
    //var(z1)=bind(extract_zone,_1),var(z2)=bind(extract_zone,_2), etc.
    //are only initializing variables that are to be used in the composition 
    //of if_then_else_return(,,) lambda expressions whose composition 
    //combines the zone-then-site-then-year-then-day-then-time criteria 
    //and amounts to a boolean that is used by sort to decide the ordering
    sort(fnames.begin(),fnames.end(),
        (var(z1)=bind(extract_zone,_1),var(z2)=bind(extract_zone,_2),
         var(s1)=bind(extract_site,_1),var(s2)=bind(extract_site,_2),
         var(y1)=bind(extract_year,_1),var(y2)=bind(extract_year,_2),
         var(d1)=bind(extract_dayofyear,_1),var(d2)=bind(extract_dayofyear,_2),
         var(t1)=bind(extract_timeofday,_1),var(t2)=bind(extract_timeofday,_2),
         IF_THEN_ELSE_RET(z1,z2,IF_THEN_ELSE_RET(s1,s2,IF_THEN_ELSE_RET(y1,y2,IF_THEN_ELSE_RET(d1,d2,IF_THEN_ELSE_RET(t1,t2,false)))))
         ));
}