Simple way to split a sequence of null-separated strings in C++
I have a series of strings stored in a single array, separated by nulls (for example ['f', 'o', 'o', '\0', 'b', 'a', 'r', '\0'...]), and I need to split this into a std::vector<std::string> or similar.
I could just write a 10-line loop to do this using std::find or strlen (in fact I just did), but I'm wondering if there is a simpler/more elegant way to do it, for example some STL algorithm I've overlooked, which can be coaxed into doing this.
It is a fairly simple task, and it wouldn't surprise me if there's some clever STL trickery that can be applied to make it even simpler.
Any takers?
My two cents:
const char* p = str;
std::vector<std::string> vector;
do {
    vector.push_back(std::string(p));
    p += vector.back().size() + 1;  // skip the string and its '\0'
} while ( /* whatever condition applies */ );
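One way to fill in that loop condition, as a minimal sketch: it assumes the total length of the buffer is known and that the last byte of the buffer is a '\0'. The function name split_nul and the parameter len are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits the len bytes starting at str.
// Assumption: the final byte of the buffer is '\0', so std::string(p)
// never reads past the end of the buffer.
std::vector<std::string> split_nul(const char* str, std::size_t len)
{
    std::vector<std::string> out;
    if (len == 0) return out;
    assert(str != nullptr && str[len - 1] == '\0');
    const char* p = str;
    const char* const end = str + len;
    do {
        out.push_back(std::string(p));
        p += out.back().size() + 1;  // skip the string and its '\0'
    } while (p != end);              // stop once the whole buffer is consumed
    return out;
}
```

Because the last byte is required to be a terminator, p always lands exactly on end, so the != comparison is safe here.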
Boost solution:
#include <boost/algorithm/string.hpp>
#include <string>
#include <vector>
std::vector<std::string> strs;
// input_array must be a Range containing the input.
boost::split(
    strs,
    input_array,
    boost::is_any_of(boost::as_array("\0")));
The following relies on std::string having an implicit constructor taking a const char*, making the loop a very simple two-liner:
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>
template< std::size_t N >
std::vector<std::string> split_buffer(const char (&buf)[N])
{
    std::vector<std::string> result;
    // each push_back copies up to the next '\0'; p then skips past it
    for(const char* p=buf; p!=buf+sizeof(buf); p+=result.back().size()+1)
        result.push_back(p);
    return result;
}
int main()
{
    std::vector<std::string> test = split_buffer("wrgl\0brgl\0frgl\0srgl\0zrgl");
    for (auto it = test.begin(); it != test.end(); ++it)
        std::cout << '"' << *it << "\"\n";
    return 0;
}
This solution assumes that the buffer's size is known and that reaching the end of the buffer marks the end of the list of strings. If the list is terminated by "\0\0" instead, the condition in the loop needs to be changed from p!=buf+sizeof(buf) to *p.
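For the "\0\0"-terminated case, the loop might look like the sketch below. The name split_double_nul is invented for this example, and the code assumes the buffer really does end with an empty string, that is, two consecutive '\0' bytes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper for a list terminated by an empty string ("\0\0").
// Assumption: the caller guarantees the double terminator is present;
// without it the loop would run off the end of the buffer.
std::vector<std::string> split_double_nul(const char* buf)
{
    assert(buf != nullptr);
    std::vector<std::string> result;
    // *p is '\0' only at the start of the empty string that ends the list
    for (const char* p = buf; *p; p += result.back().size() + 1)
        result.push_back(p);
    return result;
}
```

Note that this format cannot hold an empty string in the middle of the list, since an empty string is what marks the end.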
A more elegant and practical solution (compared to my other answer) uses getline. It boils down to 2 lines of plain C++03, with no manual loop bookkeeping or end conditions:
#include <iostream>
#include <sstream>
#include <string>
int main() {
    const char foo[] = "meh\0heh\0foo\0bar\0frob";
    std::istringstream ss (std::string(foo, foo + sizeof foo));
    std::string str;
    while (getline (ss, str, '\0'))
        std::cout << str << '\n';
}
However, note how the range-based string constructor already hints at an inherent problem with splitting at '\0': you must know the exact size of the buffer, or agree on some other character combination as the ultimate terminator.
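To get a std::vector<std::string> out of the getline approach instead of printing, something like the following sketch could work. The function name split_getline and its pointer-plus-length signature are assumptions made for this example; the length has to come from the caller for exactly the reason given above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: copies len bytes into a stream and reads them
// back with '\0' as the line delimiter.
std::vector<std::string> split_getline(const char* buf, std::size_t len)
{
    assert(buf != nullptr);
    std::vector<std::string> result;
    const std::string data(buf, len);  // the length-based constructor keeps embedded '\0's
    std::istringstream ss(data);
    std::string item;
    while (std::getline(ss, item, '\0'))
        result.push_back(item);
    return result;
}
```

With this approach it does not matter whether len includes the final '\0' or not: getline returns the last string either way and does not produce a trailing empty entry.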
Here's the solution I came up with myself, assuming the buffer ends immediately after the last string:
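#include <algorithm>
#include <string>
#include <vector>
std::vector<std::string> split(const std::vector<char>& buf) {
    std::vector<std::string> drives;  // result container
    auto cur = buf.begin();
    while (cur != buf.end()) {
        auto next = std::find(cur, buf.end(), '\0');
        drives.push_back(std::string(cur, next));
        if (next == buf.end()) break;  // no trailing '\0': don't step past the end
        cur = next + 1;
    }
    return drives;
}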
Not a great answer, admittedly, but I doubted your claim that manual splitting needs a 10-line loop. Four lines do it for me:
#include <vector>
#include <iostream>
int main() {
    using std::vector;
    const char foo[] = "meh\0heh\0foo\0bar\0frob";
    vector<vector<char> > strings(1);
    for (const char *it=foo, *end=foo+sizeof(foo); it!=end; ++it) {
        strings.back().push_back(*it);
        // start a new string after each '\0', except after the final one,
        // which would otherwise leave an empty vector at the end
        if (*it == '\0' && it+1 != end) strings.push_back(vector<char>());
    }
    std::cout << "number of strings: " << strings.size() << '\n';
    for (vector<vector<char> >::iterator it=strings.begin(), end=strings.end();
         it!=end; ++it)
        std::cout << it->data() << '\n';
}
In C, string.h has this guy:
char * strtok ( char * str, const char * delimiters );
The example on cplusplus.com:
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
    char str[] ="- This, a sample string.";
    char * pch;
    printf ("Splitting string \"%s\" into tokens:\n",str);
    pch = strtok (str," ,.-");
    while (pch != NULL)
    {
        printf ("%s\n",pch);
        pch = strtok (NULL, " ,.-");
    }
    return 0;
}
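It's not C++, but it will work for ordinary delimiters. One caveat: strtok takes its delimiter list as a null-terminated string, so '\0' itself cannot be passed as a delimiter. For null-separated data you would still have to walk the buffer yourself, for example with strlen.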