Storing variable sized strings in structures

I'm reading a file in C++ using streams, specifically, fstream, not ifstream.

blah blah blah\n
blah blah\n
blah blah blah blah \n
end

This repeats over and over with

  1. variable number of blah's in each line,
  2. constant number of lines between each end, end is the delimiter here

I want to read one set of data, then store it in a character array, in a C style structure. I started by trying to use getline(), but the delimiter can only be one character, not three. I obviously can't read a set number of bytes using just read(), as the number will be different for each set.

So I'm torn over what the easiest (and most robust) thing to do here is. Should I call getline until I find an 'end' string, while appending each string over and over?

I tried a 2D char array, but copying to it was kind of a pain. Can I use strncpy here? I don't think this worked:

char buf[10][10];
strncpy(buf[1], "blah blah",10);

I have a few ideas here, but I'm just not sure which one (or the one I haven't thought of) is the best.

EDIT: So this is for a networking application, so the size of the char array (or string) should always be the same. Also, there should be no pointers in the structure.

Related question: is the way that a char array and a std::string are stored in memory the same? I always thought there was some overhead with std::string.


Well, you said "in a C style structure", but perhaps you can just use std::string?

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main(void)
{
    std::fstream file("main.cpp");
    std::vector<std::string> lines;

    std::string line;
    while (getline(file, line))
    {
        if (line == "end")
        {
            break;
        }

        std::cout << line << std::endl;
        lines.push_back(line);
    }

    // lines now has all the lines up-to
    // and not including "end"

/* this is for reading the file
end

some stuff that'll never get printed
or added blah blah
*/
}


I'd recommend using strings instead of char arrays.
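
Given the edit in the question (fixed size, no pointers in the structure), one option is to read with std::string and copy into a fixed-size plain struct only at the network boundary. The following is a minimal sketch under assumptions: the names Record, kLines, kLineLen and pack_record are hypothetical, and the sizes are placeholders for whatever the protocol actually requires.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical limits; the real values depend on the protocol.
const std::size_t kLines = 3;     // lines per block (constant, per the question)
const std::size_t kLineLen = 32;  // bytes reserved per line, including the NUL

// Plain struct: no pointers, and every record has the same size.
struct Record {
    char lines[kLines][kLineLen];
};

// Copies the strings into rec as NUL-terminated, zero-padded lines.
// Returns false if there are too many lines or a line does not fit;
// in that case rec may be partially filled.
bool pack_record(const std::vector<std::string>& lines, Record& rec) {
    if (lines.size() > kLines) return false;
    std::memset(&rec, 0, sizeof rec);  // zero the padding so unused bytes are defined
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (lines[i].size() >= kLineLen) return false;  // leave room for the NUL
        std::memcpy(rec.lines[i], lines[i].data(), lines[i].size());
    }
    return true;
}
```

On the related question: a std::string object typically stores a pointer to separately allocated characters plus size bookkeeping (short strings may be stored inline), so sending the object's own bytes would not transmit the text. A char array holds the characters directly, which is why the copy step above is needed before sending.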


(My push_back utility described at the bottom.)

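#include <fstream>
#include <string>
#include <vector>

// push_back() is defined at the bottom of this answer;
// it must be declared before main() for this to compile.
typedef std::vector<std::string> Block;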

int main() {
  using namespace std;

  vector<Block> blocks;
  string const end = "end";

  // no real difference from using ifstream, btw
  for (fstream file("filename", ios::in); file;) {
    Block& block = push_back(blocks);
    for (string line; getline(file, line);) {
      if (line == end) {
        break;
      }
      push_back(block).swap(line);
    }
    if (!file && block.empty()) {
      // no lines read, block is a dummy not represented in the file
      blocks.pop_back();
    }
  }

  return 0;
}

Example serialization:

template<class OutIter>
void bencode_block(Block const& block, OutIter dest) {
  int len = 0;
  for (Block::const_iterator i = block.begin(); i != block.end(); ++i) {
    len += i->size() + 1; // include newline
  }
  *dest++ = len;
  *dest++ = ':';
  for (Block::const_iterator i = block.begin(); i != block.end(); ++i) {
    *dest++ = *i;
    *dest++ = '\n';
  }
}

I've used a simple bencoding serialization format. Example suitable output iterator, which just writes to a stream:

struct WriteStream {
  std::ostream& out;
  WriteStream(std::ostream& out) : out(out) {}

  WriteStream& operator++() { return *this; }
  WriteStream& operator++(int) { return *this; }
  WriteStream& operator*() { return *this; }

  template<class T>
  void operator=(T const& value) {
    out << value;
  }
};

Example use:

bencode_block(block, WriteStream(std::cout));
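
To make the byte layout concrete, here is a condensed variant that returns a string instead of writing through an output iterator. It is only an illustration of the same "length, colon, data" layout; encode_block is a hypothetical name, not part of the answer's code.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> Block;

// Produces "<length>:" followed by each line plus '\n',
// where <length> counts the lines and their newlines.
std::string encode_block(const Block& block) {
    std::size_t len = 0;
    for (Block::const_iterator i = block.begin(); i != block.end(); ++i) {
        len += i->size() + 1;  // include newline
    }
    std::ostringstream out;
    out << len << ':';
    for (Block::const_iterator i = block.begin(); i != block.end(); ++i) {
        out << *i << '\n';
    }
    return out.str();
}
```

For a block holding "blah blah" and "blah", this yields "15:blah blah\nblah\n" (10 bytes for the first line with its newline, 5 for the second). The length prefix tells the receiver how many bytes to read after the colon.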

Another possible output iterator, which writes to a file descriptor (such as a network socket):

#include <cerrno>     // errno
#include <cstdio>     // sprintf
#include <cstring>    // strerror
#include <stdexcept>  // std::runtime_error
#include <string>
#include <unistd.h>   // POSIX write()

struct WriteFD {
  int out;
  WriteFD(int out) : out(out) {}

  WriteFD& operator++() { return *this; }
  WriteFD& operator++(int) { return *this; }
  WriteFD& operator*() { return *this; }

  template<class T>
  void operator=(T const& value) {
    if (write(value) == -1) {
      throw std::runtime_error(std::strerror(errno));
    }
  }

  //NOTE: write methods don't currently handle writing fewer bytes than provided
  // The member functions named write hide the POSIX function,
  // so the system call has to be qualified as ::write.
  ssize_t write(char value) {
    return ::write(out, &value, 1);
  }
  ssize_t write(std::string const& value) {
    return ::write(out, value.data(), value.size());
  }
  ssize_t write(int value) {
    char buf[20];
    // handles INT_MAX up to   9999999999999999999
    // handles INT_MIN down to -999999999999999999
    // that's 19 and 18 nines, respectively (you did count, right? :P)
    int len = sprintf(buf, "%d", value);
    return ::write(out, buf, len);
  }
};

Poor man's move semantics:

template<class C>
typename C::value_type& push_back(C& container) {
  container.push_back(typename C::value_type());
  return container.back();
}

This allows easy use of move semantics to avoid unnecessary copies:

container.push_back(value); // copies
// becomes:
// (C is the type of container)
container.push_back(C::value_type()); // add empty
container.back().swap(value); // swap contents
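
With a C++11 or later compiler (which this answer does not assume), the same effect is available directly through std::move. A minimal sketch, where append_line is a hypothetical helper name:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: moves line into block instead of copying it.
// After the call, line is in a valid but unspecified state, so assign
// to it or clear it before reading from it again.
void append_line(std::vector<std::string>& block, std::string& line) {
    block.push_back(std::move(line));
}
```

Reusing the moved-from string as the target of the next getline() call is fine, because getline() replaces its contents.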


This is really a parsing problem you are describing. Once you realise what the problem is, you are already most of the way to the solution.

It is tough to get more specific, as you don't really describe what you need done with the data. But typically you can do simple parsing inline. In this case, perhaps you'd want a little routine that recognizes "blah", EOL, and "end", and tells you which one it found at a given string location.

Then you can have a parse_line routine that recognizes an entire line (expecting any number of "blah"s ending with an EOL).

Then you can have a parse routine that calls parse_line your given number of times (10?), and then errors if "end" isn't found.
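
The routines described above could be sketched roughly as follows. This is one possible shape under assumptions: it treats "blah" as a whitespace-separated word, requires at least one per line, and uses getline() for the EOL handling; parse_block is a hypothetical name for the outer routine.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns true if line consists of one or more "blah" words
// separated by whitespace (trailing spaces are allowed).
bool parse_line(const std::string& line) {
    std::istringstream words(line);
    std::string word;
    std::size_t count = 0;
    while (words >> word) {
        if (word != "blah") return false;
        ++count;
    }
    return count > 0;
}

// Reads exactly lines_per_block valid lines followed by an "end" line.
// On success, block holds the lines and the function returns true;
// on any mismatch or premature end of input it returns false.
bool parse_block(std::istream& in, std::size_t lines_per_block,
                 std::vector<std::string>& block) {
    block.clear();
    std::string line;
    for (std::size_t i = 0; i < lines_per_block; ++i) {
        if (!std::getline(in, line) || !parse_line(line)) return false;
        block.push_back(line);
    }
    return std::getline(in, line) && line == "end";
}
```

Calling parse_block in a loop until it returns false reads every set in the file; a false result before end of input signals malformed data.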
