C++, Boost: what is the fastest way to parse a string like tcp://adr:port/ into an address string and an int for the port?
We have a std::string A containing tcp://adr:port/
How can it be parsed into an address std::string and an int for the port?
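Although some wouldn't consider it particularly kosher C++, probably the easiest way would be to use sscanf. Note that %[ needs a char buffer (not a std::string), and a field width keeps it from overflowing:
char addr[256]; int port; // addr holds at most 255 characters plus the terminator
sscanf(A.c_str(), "tcp://%255[^:]:%d", addr, &port); // returns 2 on success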
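As a slightly fuller sketch of the sscanf approach, the call can be wrapped in a small function that checks the return value. The name parse_with_sscanf and the 255-character limit are assumptions for illustration, not part of the original answer:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: splits "tcp://host:port/" into host and port.
// Returns false if the input does not have the expected shape.
bool parse_with_sscanf(const std::string& url, std::string& addr, int& port)
{
    char host[256] = {0};  // 255 characters plus the terminating '\0'
    int  p = 0;

    // %255[^:] reads up to 255 characters that are not ':'.
    // sscanf returns the number of fields it converted; we need both.
    if (std::sscanf(url.c_str(), "tcp://%255[^:]:%d", host, &p) != 2)
        return false;

    addr = host;
    port = p;
    return true;
}
```

Because the host is read up to the first ':', this only suits host names and IPv4 addresses; it does not range-check the port either.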
Another possibility would be to put the string into a stringstream, imbue the stream with a locale whose ctype facet treats most alphabetic and punctuation characters as whitespace, and just read the address and port like:
std::istringstream buffer(A);
buffer.imbue(std::locale(std::locale(), new digits_only)); // the locale takes ownership of the facet
buffer >> addr >> port;
The facet would look something like this:
#include <algorithm>
#include <locale>
#include <vector>

struct digits_only: std::ctype<char>
{
    digits_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        // everything is white-space:
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        // except digits, which are digits ('9' + 1 so that '9' is included)
        std::fill(&rc['0'], &rc['9'] + 1, std::ctype_base::digit);
        // and '.', which we'll call punctuation:
        rc['.'] = std::ctype_base::punct;
        return &rc[0];
    }
};
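operator>> treats whitespace as separators between "fields", so this will treat something like 192.168.1.1:25 as two fields: "192.168.1.1" and "25". Since letters count as whitespace too, this only works for numeric (dotted) addresses, not host names.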
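Putting the pieces together, a complete version of the facet approach might look like the sketch below. The wrapper name parse_with_facet is an assumption made for illustration; the facet itself follows the one described above.

```cpp
#include <algorithm>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// ctype facet that classifies everything except digits and '.' as whitespace.
struct digits_only : std::ctype<char>
{
    digits_only() : std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        std::fill(&rc['0'], &rc['9'] + 1, std::ctype_base::digit);
        rc['.'] = std::ctype_base::punct;
        return &rc[0];
    }
};

// Hypothetical wrapper: reads a dotted numeric address and a port.
bool parse_with_facet(const std::string& url, std::string& addr, int& port)
{
    std::istringstream buffer(url);
    // The locale owns the facet and deletes it when no longer needed.
    buffer.imbue(std::locale(std::locale(), new digits_only));
    // "tcp://" and ':' are skipped as whitespace between the two fields.
    return static_cast<bool>(buffer >> addr >> port);
}
```

This does not validate the scheme or the separators at all, so it is best kept for input that is already known to be well formed.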
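#include <string>
#include <boost/regex.hpp>

// Leaves address and service untouched if ip does not match tcp://<address>:<port>/
void extract(std::string const& ip, std::string& address, std::string& service)
{
    boost::regex e("tcp://(.+):(\\d+)/");
    boost::smatch what;
    if(boost::regex_match(ip, what, e, boost::match_extra))
    {
        boost::smatch::iterator it = what.begin();
        ++it; // skip the first entry (the whole match)
        address = *it; // first capture group: the address
        ++it;
        service = *it; // second capture group: the port digits
    }
}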
EDIT: the reason service is a string here is that you'll need it as a string for the resolver! ;)
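If the port is wanted as an int after all, the same pattern can be used with std::regex from C++11 and converted with std::stoi. This is only a sketch of that variation: the function name parse_with_regex is hypothetical, and the port is limited to five digits here so that std::stoi cannot overflow.

```cpp
#include <regex>
#include <string>

// Hypothetical variant of the regex approach using the C++11 standard library.
bool parse_with_regex(const std::string& url, std::string& address, int& port)
{
    // Compile the pattern once; (.+) is greedy, so the split happens
    // at the last ':' that is followed by digits and a '/'.
    static const std::regex e("tcp://(.+):(\\d{1,5})/");
    std::smatch what;
    if (!std::regex_match(url, what, e))
        return false;

    address = what[1].str();          // first capture group
    port = std::stoi(what[2].str());  // at most five digits, fits in an int
    return true;
}
```

Regular expressions are convenient here but are unlikely to be the fastest option in terms of run time.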
Fastest as in computer time or programmer time? I can't speak to benchmarks, but the URI library in the cpp-netlib framework works very well and is easy and straightforward to use.
http://cpp-netlib.github.com/0.8-beta/uri.html
You could use a tool like re2c to create a fast custom scanner. I'm also unclear on what you consider to be "fastest" -- for the processor or development time or both?
Nowadays one may also meet IPv6 addresses, whose host part already contains a variable number of colons and possibly dots. Splitting URLs should then be done following RFC 3986. See the Wikipedia article on IPv6.
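RFC 3986 writes IPv6 literals in square brackets (for example tcp://[::1]:8080/), so a hand-written splitter can look for the closing bracket before looking for the port separator. The following is a minimal sketch under that assumption, not a full RFC 3986 parser; the name parse_tcp_url and the 65535 port limit are choices made for illustration.

```cpp
#include <string>

// Hypothetical helper: handles "tcp://host:port", "tcp://host:port/"
// and bracketed IPv6 hosts such as "tcp://[::1]:8080/".
bool parse_tcp_url(const std::string& url, std::string& host, int& port)
{
    const std::string scheme = "tcp://";
    if (url.compare(0, scheme.size(), scheme) != 0)
        return false;

    std::string::size_type pos = scheme.size();
    std::string::size_type host_end;
    if (pos < url.size() && url[pos] == '[') {
        // Bracketed IPv6 literal: the host is the text between '[' and ']'.
        host_end = url.find(']', pos);
        if (host_end == std::string::npos) return false;
        host = url.substr(pos + 1, host_end - pos - 1);
        ++host_end;  // step past ']'
    } else {
        // Host name or IPv4 address: the host ends at the first ':'.
        host_end = url.find(':', pos);
        if (host_end == std::string::npos) return false;
        host = url.substr(pos, host_end - pos);
    }
    if (host.empty() || host_end >= url.size() || url[host_end] != ':')
        return false;

    // Port: one or more digits, followed by '/' or the end of the string.
    std::string::size_type i = host_end + 1;
    std::string::size_type digits = 0;
    long value = 0;
    while (i < url.size() && url[i] >= '0' && url[i] <= '9') {
        value = value * 10 + (url[i] - '0');
        if (value > 65535) return false;  // outside the TCP port range
        ++i;
        ++digits;
    }
    if (digits == 0) return false;
    if (i != url.size() && url[i] != '/') return false;

    port = static_cast<int>(value);
    return true;
}
```

A plain scan like this avoids stream and regex overhead, which may matter if "fastest" means run time, though that would need measuring.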