How to extract all the names in a text using regular expressions

English names follow the format Harry Potter: a first name followed by a last name. How can I extract this kind of pattern using C++?


Well, a very simple regex would be /\b([A-Z][a-z]+) ([A-Z][a-z]+)\b/.

EDIT: This does not handle odd capitalisation (e.g. McDonald) or stray apostrophes (e.g. O'Brien).

EDIT: Removed ^ and $ and added word boundaries instead.
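
To make that limitation concrete, here is a rough sketch of the same regex used from C++ with std::sregex_iterator, pulling the first and last name out of the two capture groups. The helper name extract_names is made up for illustration, and this is only one way to wire it up, assuming a C++11 compiler with a working <regex>.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original answer): returns a
// (first, last) pair for every "Capitalised Capitalised" word pair.
std::vector<std::pair<std::string, std::string>>
extract_names(const std::string& text)
{
    // Raw string literal, so \b reaches the regex engine unchanged.
    static const std::regex pattern(R"(\b([A-Z][a-z]+) ([A-Z][a-z]+)\b)");

    std::vector<std::pair<std::string, std::string>> names;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it)
    {
        // Group 1 is the first name, group 2 the last name.
        names.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return names;
}
```

Calling extract_names("yesterday Harry Potter met Ron Weasley.") should give the pairs (Harry, Potter) and (Ron, Weasley), while "Ronald McDonald" and "Conan O'Brien" give nothing at all. Note that it will also pair up any two adjacent capitalised words, so "Then Harry Potter left" yields (Then, Harry) rather than the name you wanted.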


You can start from something like this.

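#include <iostream>
#include <regex>
#include <string>

int main()
{
   // regular expression: two capitalised words separated by whitespace
   // (the backslash is doubled because "\s" is not a valid escape
   // in an ordinary string literal)
   const std::regex pattern("([A-Z][a-z]+)\\s([A-Z][a-z]+)");

   // the source text
   std::string text = "string containing names ...";

   // a default-constructed iterator marks the end of the matches
   const std::sregex_token_iterator end;
   for (std::sregex_token_iterator i(text.cbegin(), text.cend(), pattern);
        i != end;
        ++i)
   {
      // each token is one whole match, e.g. "Harry Potter"
      std::cout << *i << std::endl;
   }

   return 0;
}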

Learning regex helps.
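
If you only want one part of each name, std::sregex_token_iterator also accepts a submatch index as a fourth constructor argument. Below is a small sketch along those lines; last_names is a hypothetical helper name, and the pattern is the same one as in the loop above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects only the last names, i.e. capture group 2.
std::vector<std::string> last_names(const std::string& text)
{
    static const std::regex pattern(R"(([A-Z][a-z]+)\s([A-Z][a-z]+))");

    std::vector<std::string> result;
    const std::sregex_token_iterator end;
    // The trailing 2 selects the second capture group of every match;
    // 0 (the default) would select the whole match instead.
    for (std::sregex_token_iterator i(text.begin(), text.end(), pattern, 2);
         i != end; ++i)
    {
        result.push_back(i->str());
    }
    return result;
}
```

Passing 1 instead of 2 would collect the first names, and passing the list {1, 2} makes the iterator alternate between first and last name for each match.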
