
Using stringstream instead of `sscanf` to parse a fixed-format string

I would like to use the facilities provided by stringstream to extract values from a fixed-format string as a type-safe alternative to sscanf. How can I do this?

Consider the following specific use case. I have a std::string in the following fixed format:

YYYYMMDDHHMMSSmmm

Where:

YYYY = 4 digits representing the year
MM = 2 digits representing the month ('0' padded to 2 characters)
DD = 2 digits representing the day ('0' padded to 2 characters)
HH = 2 digits representing the hour ('0' padded to 2 characters)
MM = 2 digits representing the minute ('0' padded to 2 characters)
SS = 2 digits representing the second ('0' padded to 2 characters)
mmm = 3 digits representing the milliseconds ('0' padded to 3 characters)

Previously I was doing something along these lines:

string s = "20101220110651184";
unsigned year = 0, month = 0, day = 0, hour = 0, minute = 0, second = 0, milli = 0;    
sscanf(s.c_str(), "%4u%2u%2u%2u%2u%2u%3u", &year, &month, &day, &hour, &minute, &second, &milli );

The width values are magic numbers, and that's ok. I'd like to use streams to extract these values and convert them to unsigneds in the interest of type safety. But when I try this:

stringstream ss;
ss << "20101220110651184";
ss >> setw(4) >> year;

year retains the value 0. It should be 2010.

How do I do what I'm trying to do? I can't use Boost or any other 3rd party library, nor can I use C++0x.


One not particularly efficient option would be to construct some temporary strings and use a lexical cast:

std::string s("20101220110651184");
int year = lexical_cast<int>(s.substr(0, 4));
// etc.

lexical_cast can be implemented in just a few lines of code; Herb Sutter presented the bare minimum in his article, "The String Formatters of Manor Farm."

It's not exactly what you're looking for, but it's a type-safe way to extract fixed-width fields from a string.
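
As a rough sketch of that idea (this is not Sutter's code: the `lexical_cast` below is a hypothetical minimal stand-in built on std::istringstream, and `Timestamp` and `parse_timestamp` are names made up for this example), the fields could be sliced out like this using only C++03 features:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical minimal stand-in for boost::lexical_cast (string source only).
// Throws std::runtime_error if the text does not convert cleanly.
template <typename Target>
Target lexical_cast(const std::string& text)
{
    std::istringstream in(text);
    Target value;
    char extra;
    // The extraction must succeed, and nothing but whitespace may follow it.
    if (!(in >> value) || (in >> extra))
        throw std::runtime_error("lexical_cast: cannot convert \"" + text + "\"");
    return value;
}

// Illustrative holder for the seven fields of YYYYMMDDHHMMSSmmm.
struct Timestamp {
    unsigned year, month, day, hour, minute, second, milli;
};

// Slices each fixed-width field out with substr and converts it.
Timestamp parse_timestamp(const std::string& s)
{
    assert(s.size() == 17);  // precondition: exactly YYYYMMDDHHMMSSmmm
    Timestamp t;
    t.year   = lexical_cast<unsigned>(s.substr(0, 4));
    t.month  = lexical_cast<unsigned>(s.substr(4, 2));
    t.day    = lexical_cast<unsigned>(s.substr(6, 2));
    t.hour   = lexical_cast<unsigned>(s.substr(8, 2));
    t.minute = lexical_cast<unsigned>(s.substr(10, 2));
    t.second = lexical_cast<unsigned>(s.substr(12, 2));
    t.milli  = lexical_cast<unsigned>(s.substr(14, 3));
    return t;
}
```

Each substr call allocates a temporary string, which is the inefficiency mentioned above; for a 17-character timestamp that cost is usually negligible.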


Erm, if it's fixed format, why don't you do this?

  std::string sd("20101220110651184");
  // insert spaces from the back
  sd.insert(14, 1, ' ');
  sd.insert(12, 1, ' ');
  sd.insert(10, 1, ' ');
  sd.insert(8, 1, ' ');
  sd.insert(6, 1, ' ');
  sd.insert(4, 1, ' ');
  int year, month, day, hour, min, sec, ms;
  std::istringstream str(sd);
  str >> year >> month >> day >> hour >> min >> sec >> ms;


I use the following, it might be useful for you:

template<typename T> T stringTo( const std::string& s )
   {
      std::istringstream iss(s);
      T x;
      iss >> x;
      return x;
   }

template<typename T> inline std::string toString( const T& x )
   {
      std::ostringstream o;
      o << x;
      return o.str();
   }

These templates require:

#include <sstream>

Usage:

std::string line;
std::getline( std::cin, line );
long date = stringTo<long>( line );

YMMV
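
One caveat: if the extraction fails, stringTo returns an `x` that was never assigned (before C++11 a failed extraction leaves the target untouched). Below is a minimal sketch of a checked variant applied to the fixed-width fields from the question. `tryStringTo`, `readField` and `demo` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical checked variant of stringTo: reports failure through the
// return value and leaves `out` untouched when the conversion fails.
template <typename T>
bool tryStringTo(const std::string& s, T& out)
{
    std::istringstream iss(s);
    T tmp;
    if (!(iss >> tmp))
        return false;
    out = tmp;
    return true;
}

// Converts the `width`-character field of `s` that starts at `pos`.
// Returns false if the string is too short or the field is not a number.
bool readField(const std::string& s, std::string::size_type pos,
               std::string::size_type width, unsigned& out)
{
    if (pos + width > s.size())
        return false;
    return tryStringTo(s.substr(pos, width), out);
}

// Usage example with the timestamp from the question.
void demo()
{
    const std::string s("20101220110651184");
    unsigned year = 0, month = 0, milli = 0;
    bool ok = readField(s, 0, 4, year)
           && readField(s, 4, 2, month)
           && readField(s, 14, 3, milli);
    assert(ok);
    assert(year == 2010 && month == 12 && milli == 184);
}
```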


From here, you might find this useful:

template<typename T, typename charT, typename traits>
std::basic_istream<charT, traits>&
  fixedread(std::basic_istream<charT, traits>& in, T& x)
{
  if (in.width(  ) == 0)
    // Not fixed size, so read normally.
    in >> x;
  else {
    std::basic_string<charT, traits> field;
    in >> field;
    std::basic_istringstream<charT, traits> stream(field);
    if (! (stream >> x))
      in.setstate(std::ios_base::failbit);
  }
  return in;
}

setw() only affects extraction into strings and C strings (character arrays); numeric extraction ignores the field width. The function above uses this fact: it reads the field into a string and then converts it to the required type. You can use it in combination with setw() or ss.width(w) to read a fixed-width field of any type.
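
This also explains the behaviour in the question: `ss >> setw(4) >> year` ignores the width, consumes all 17 digits, overflows a 32-bit unsigned and sets failbit. As a sketch of how fixedread might be driven for that timestamp, the block below repeats the function so that it compiles on its own; `parse_timestamp`, the `widths` table and `demo` are assumptions made up for this example.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Same technique as above: when a width is set, read that many characters
// into a string first, then convert the string to T.
template <typename T, typename charT, typename traits>
std::basic_istream<charT, traits>&
fixedread(std::basic_istream<charT, traits>& in, T& x)
{
    if (in.width() == 0) {
        in >> x;  // no width set, so read normally
    } else {
        std::basic_string<charT, traits> field;
        in >> field;  // honours the width, then resets it to 0
        std::basic_istringstream<charT, traits> stream(field);
        if (!(stream >> x))
            in.setstate(std::ios_base::failbit);
    }
    return in;
}

// Hypothetical helper: splits YYYYMMDDHHMMSSmmm into its seven fields.
bool parse_timestamp(const std::string& s, unsigned (&fields)[7])
{
    static const int widths[7] = { 4, 2, 2, 2, 2, 2, 3 };
    std::istringstream in(s);
    for (int i = 0; i < 7; ++i) {
        in >> std::setw(widths[i]);  // the width applies to the next read only
        if (!fixedread(in, fields[i]))
            return false;
    }
    return true;
}

// Usage example with the timestamp from the question.
void demo()
{
    unsigned f[7] = { 0 };
    bool ok = parse_timestamp("20101220110651184", f);
    assert(ok);
    assert(f[0] == 2010 && f[1] == 12 && f[2] == 20);
    assert(f[3] == 11 && f[4] == 6 && f[5] == 51 && f[6] == 184);
}
```

Because the string extraction skips leading whitespace and stops at whitespace, this approach assumes the fields themselves contain no spaces.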


template<typename T>
struct FixedRead {
    T& content;
    int size;
    FixedRead(T& content, int size) :
            content(content), size(size) {
        assert(size != 0);
    }
    template<typename charT, typename traits>
    friend std::basic_istream<charT, traits>&
    operator >>(std::basic_istream<charT, traits>& in, FixedRead<T> x) {
        std::streamsize orig_w = in.width();
        std::basic_string<charT, traits> o;
        in >> std::setw(x.size) >> o;
        std::basic_stringstream<charT, traits> os(o);
        if (!(os >> x.content))
            in.setstate(std::ios_base::failbit);
        in.width(orig_w);
        return in;
    }
};

template<typename T>
FixedRead<T> fixed_read(T& content, int size) {
    return FixedRead<T>(content, size);
}

void test4() {
    stringstream ss("20101220110651184");
    int year = 0, month = 0, day = 0, hour = 0, min = 0, sec = 0, ms = 0;
    ss >> fixed_read(year, 4) >> fixed_read(month, 2) >> fixed_read(day, 2)
            >> fixed_read(hour, 2) >> fixed_read(min, 2) >> fixed_read(sec, 2)
            >> fixed_read(ms, 3);
    cout << "year:" << year << "," << "month:" << month << "," << "day:" << day
            << "," << "hour:" << hour << "," << "min:" << min << "," << "sec:"
            << sec << "," << "ms:" << ms << endl;
}


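The solution of ps5mh is really nice, but it does not work for fixed-width parsing of strings whose fields contain whitespace, because the formatted string extraction skips leading whitespace and stops at the next whitespace character. The following solution fixes this by reading the raw characters instead: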

template<typename T, typename T2>
struct FixedRead
{
    T& content;
    T2& number;
    int size;
    FixedRead(T& content, int size, T2 & number) :
        content(content), number(number), size(size)
    {
        assert (size != 0);
    }
    template<typename charT, typename traits>
    friend std::basic_istream<charT, traits>&
    operator >>(std::basic_istream<charT, traits>& in, FixedRead<T,T2> x)
    {
        if (!in.eof() && in.good())
        {
            std::vector<charT> buffer(x.size + 1);
            in.read(&buffer[0], x.size);
            std::streamsize num_read = in.gcount();
            buffer[num_read] = 0; // null-terminate the field
            std::basic_stringstream<charT, traits> os(&buffer[0]);
            if (!(os >> x.content))
                in.setstate(std::ios_base::failbit);
            else
                ++x.number;
        }
        return in;
    }
};
template<typename T, typename T2>
FixedRead<T,T2> fixedread(T& content, int size, T2 & number) {
    return FixedRead<T,T2>(content, size, number);
}

This can be used as:

std::string s  = "90007127       19000715790007397";
std::vector<int> ints(5);
int num_read = 0;
std::istringstream in(s);
in >> fixedread(ints[0], 8, num_read) 
   >> fixedread(ints[1], 8, num_read) 
   >> fixedread(ints[2], 8, num_read) 
   >> fixedread(ints[3], 8, num_read) 
   >> fixedread(ints[4], 8, num_read);
// output: 
//   num_read = 4 (like return value of sscanf)
//   ints = 90007127, 1, 90007157, 90007397
//   ints[4] keeps its initial value of 0; the fifth read fails and sets failbit