How can I find all occurrences of a regex in curl output in C++?
I would like to search for every IP address in a curl output. Is there a quick way to do so? I know about regex_search from boost, but from what I read, it is targeted for files.
My actual non-working code:
#include <iostream>
#include <curl/curl.h>
#include <boost/regex.hpp>
using namespace std;
boost::regex expression("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}");
boost::smatch what; // "match" specialized for std::string of char
boost::match_flag_type flags = boost::match_default;
string buffer = "hey";
int writer(char *data, size_t size, size_t nmemb, string *buffer){
    int result = 0;
    if(buffer != NULL) {
        buffer->append(data, size * nmemb);
        result = size * nmemb;
    }
    return result;
}
int main(int argc, char *argv[]) {
    CURL *curl;
    CURLcode res;
    curl = curl_easy_init();
    if(curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://www.xroxy.com/proxylist.php");
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 0); /* Don't follow anything else than the particular url requested*/
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writer); /* Function Pointer "writer" manages the required buffer size */
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &buffer ); /* Data Pointer &buffer stores downloaded web content */
        curl_easy_perform(curl);
        /* always cleanup */
        curl_easy_cleanup(curl);
    }
    if (boost::regex_search(buffer.begin(), buffer.end(), what, expression, flags) ) {
        cout << "found: " << what << endl;
    }
    return 0;
}
boost::regex can be used to search strings.
I don't know how you acquire the output of curl, but I suppose you can get it into a std::string; then you can just search it with boost.
std::string s(/*...*/);
boost::regex expression("[abc]{5}"); // just an example
boost::smatch what; // "match" specialized for std::string of char
boost::match_flag_type flags = boost::match_default;
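// Pass the string itself: s.begin() on a non-const std::string yields a
// non-const iterator, which does not match the const_iterator in boost::smatch.
if ( boost::regex_search(s, what, expression, flags) ) {
    cout << "found: " << what << endl;
}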
boost::smatch is a very versatile and useful class: it can give you the whole match as a std::string and iterators to its start and end, and the same for each subgroup in your regex.
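Since the question asks for every match rather than just the first, a regex iterator is probably the simplest route. Below is a minimal sketch, written against std::regex from C++11 so that it builds without Boost; Boost.Regex provides boost::sregex_iterator with the same shape, so swapping the namespace should be close to a drop-in change. The helper name find_ips is made up for illustration, and the pattern is the one from the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every dotted-quad-looking substring in `text`.
// Note that \d{1,3} also accepts out-of-range octets such as 999.
std::vector<std::string> find_ips(const std::string& text) {
    static const std::regex ip(R"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})");
    std::vector<std::string> found;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(text.begin(), text.end(), ip), end; it != end; ++it) {
        found.push_back(it->str());  // it->str() is the whole match
    }
    return found;
}
```

Calling find_ips(buffer) once curl_easy_perform has returned would give you all the addresses in the downloaded page in one go.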
"regex_search from boost, but from what I read, it is targeted for files."
Are you sure? From what I see here: http://www.boost.org/doc/libs/1_45_0/libs/regex/doc/html/boost_regex/ref/regex_search.html
"Determines whether there is some sub-sequence within [first,last) that matches the regular expression e, parameter flags is used to control how the expression is matched against the character sequence. Returns true if such a sequence exists, false otherwise."
Am I missing something here?
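To illustrate that [first,last) can be any in-memory range, here is a rough sketch that calls regex_search repeatedly and moves first past each match. It uses std::regex so that it compiles without Boost; the Boost iterator overload takes the same arguments as far as I know. The name match_offsets is invented for this example.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: offsets (from the start of `text`) of every match of `re`.
std::vector<std::size_t> match_offsets(const std::string& text, const std::regex& re) {
    std::string::const_iterator first = text.begin();
    const std::string::const_iterator last = text.end();
    std::vector<std::size_t> offsets;
    std::smatch what;
    while (std::regex_search(first, last, what, re)) {
        offsets.push_back(static_cast<std::size_t>(what[0].first - text.begin()));
        first = what[0].second;          // resume right after this match
        if (what[0].length() == 0) {     // empty match: step forward to avoid an endless loop
            if (first == last) break;
            ++first;
        }
    }
    return offsets;
}
```

No file is involved anywhere: the range is just a pair of iterators into the string that the curl write callback filled.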
Anyway, another option would be to filter curl's output through grep.