How to extract all the names in a text using regular expressions
English names usually follow the format "Harry Potter": a first name followed by a last name. How can I extract this kind of pattern using C++?
Well, a very simple regex would be /\b([A-Z][a-z]+) ([A-Z][a-z]+)\b/.
EDIT: This does not handle odd capitalisation and stray apostrophes.
EDIT: Removed ^ and $, placed word boundaries.
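For what it's worth, here is a minimal sketch of how the word-boundary pattern above could be used with std::regex to pull out the first and last name separately through the two capture groups. The helper name extract_names is hypothetical, chosen only for illustration.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original answer): returns a
// (first, last) pair for every match of the pattern in `text`.
std::vector<std::pair<std::string, std::string>> extract_names(const std::string& text)
{
    // \b marks a word boundary; each parenthesised group captures one name part.
    static const std::regex pattern("\\b([A-Z][a-z]+) ([A-Z][a-z]+)\\b");

    std::vector<std::pair<std::string, std::string>> names;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern); it != end; ++it) {
        // Group 1 is the first name, group 2 the last name.
        names.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return names;
}
```

Called on "we saw Harry Potter and Hermione Granger today", this should give the pairs (Harry, Potter) and (Hermione, Granger). Any two adjacent capitalised words match, so a capitalised sentence-initial word followed by a name (e.g. "Yesterday Harry") is picked up as well.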
You can start from something like this:
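#include <iostream>
#include <regex>
#include <string>

int main()
{
    // Regular expression: two capitalised words separated by one whitespace
    // character. The backslash is doubled so the regex engine receives \s.
    const std::regex pattern("([A-Z][a-z]+)\\s([A-Z][a-z]+)");
    // The source text
    std::string text = "string containing names ...";
    // A default-constructed iterator marks the end of the matches
    const std::sregex_token_iterator end;
    for (std::sregex_token_iterator i(text.cbegin(), text.cend(), pattern);
         i != end;
         ++i)
    {
        // *i is the whole match, e.g. "Harry Potter"
        std::cout << *i << std::endl;
    }
    return 0;
}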
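As noted earlier in the thread, the simple pattern misses names with odd capitalisation or apostrophes. One possible way to loosen it, offered only as a sketch and not as a complete solution, is to allow further letters of either case after the initial capital and optional apostrophe-separated segments. The helper name loose_names and the exact shape of the name part are assumptions made for this example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration): returns every
// "First Last" match, tolerating names such as O'Brien or McGregor.
std::vector<std::string> loose_names(const std::string& text)
{
    // One name part: a capital letter, then letters of either case,
    // then zero or more segments introduced by an apostrophe.
    const std::string part = "[A-Z][A-Za-z]*(?:'[A-Za-z]+)*";
    static const std::regex pattern("\\b(" + part + ")\\s+(" + part + ")\\b");

    std::vector<std::string> found;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern); it != end; ++it) {
        found.push_back(it->str());  // the whole match, both name parts
    }
    return found;
}
```

On "the guests were Conan O'Brien and Ewan McGregor" this should return "Conan O'Brien" and "Ewan McGregor". The trade-off is more false positives: any two adjacent capitalised words, including all-caps ones, now match.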
Learning regular expressions helps.