Regex to match partial words (JavaScript)

I would like to craft a case-insensitive regex (for JavaScript) that matches street names, even if each word has been abbreviated. For example:

n univ av should match N University Ave

king blv should match Martin Luther King Jr. Blvd

ne 9th should match both NE 9th St and 9th St NE

Bonus points (JK) for a "replace" regex that wraps the matched text with <b> tags.


You got:

"n univ av"

You want:

"\bn.*\buniv.*\bav.*"

So you do:

var regex = new RegExp("n univ av".replace(/(\S+)/g, function(s) { return "\\b" + s + ".*" }).replace(/\s+/g, ''), "gi");

Voilà!

But I'm not done, I want my bonus points. So we change the pattern to:

var regex = new RegExp("n univ av".replace(/(\S+)/g, function(s) { return "\\b(" + s + ")(.*)" }).replace(/\s+/g, ''), "gi");

And then:

var matches = regex.exec("N University Ave");

Now we have:

  • matches[0] => the entire matched text (not needed here)
  • matches[odd] => the part of the subject matched by one of our search terms
  • matches[even] => the text that follows that term and was not in the search string

So, we can write:

var result = '';
for (var i=1; i < matches.length; i++)
{
  if (i % 2 == 1)
    result += '<b>' + matches[i] + '</b>';
  else
    result += matches[i];
}
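
A rough sketch of how the pieces above could be combined into one function. The names escapeRegExp and highlightInOrder are made up for illustration, and a few things differ from the snippets above: the gaps between terms use a lazy (.*?) rather than a greedy (.*), the text before and after the match is kept (exec() only returns the matched part, so "Martin Luther " would otherwise be lost in the second example), and the search terms are escaped so that a term like "jr." is taken literally.

```javascript
// Hypothetical helper: escape regex metacharacters in a search term.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: highlight the search terms, which must appear
// in the given order. Returns null when the subject does not match.
function highlightInOrder(subject, search) {
  var terms = search.trim().split(/\s+/).map(escapeRegExp);
  // Builds \b(term1)(.*?)\b(term2)(.*?)...\b(termN)
  var source = terms.map(function (t) {
    return "\\b(" + t + ")";
  }).join("(.*?)");
  var m = new RegExp(source, "i").exec(subject);
  if (!m) return null;

  var result = subject.slice(0, m.index); // text before the match
  for (var i = 1; i < m.length; i++) {
    // odd groups are search terms, even groups are the gaps between them
    result += (i % 2 === 1) ? "<b>" + m[i] + "</b>" : m[i];
  }
  return result + subject.slice(m.index + m[0].length); // text after the match
}

highlightInOrder("Martin Luther King Jr. Blvd", "king blv");
// ==> "Martin Luther <b>King</b> Jr. <b>Blv</b>d"
highlightInOrder("9th St NE", "ne 9th");
// ==> null (the terms appear in a different order)
```

Since this version requires the terms in order, it does not cover the "ne 9th" vs. "9th St NE" case from the question.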


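function highlightPartial(subject, search) {
  // Regex metacharacters that must be escaped inside the search terms
  var special = /([.*+?^$(){}|\[\]\\])/g;
  // Split on any run of whitespace and drop empty pieces
  var parts   = search.split(/\s+/).filter(Boolean).map(function(s) {
    // "\\b" is the regex word boundary; a bare "\b" in a string is a backspace
    return "\\b" + s.replace(special, "\\$1");
  });
  if (!parts.length) return subject; // nothing to highlight
  var re = new RegExp("(" + parts.join("|") + ")", "gi");
  subject = subject.replace(re, function(match, text) {
    return "<b>" + text + "</b>";
  });
  return subject;
}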

var result = highlightPartial("N University Ave", "n univ av");
// ==> "<b>N</b> <b>Univ</b>ersity <b>Av</b>e"

Side note - this implementation does not pay attention to match order, so:

var result = highlightPartial("N University Ave", "av univ n");
// ==> "<b>N</b> <b>Univ</b>ersity <b>Av</b>e"

If that's a problem, a more elaborate sequential approach would become necessary, something that I have avoided here by using a replace() callback function.
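
Related to the side note: the alternation also highlights a subject that contains only some of the terms. If every term has to be present, in any order (as in the "ne 9th" example from the question), one option is a separate test built from lookaheads. This is only a sketch; matchesAllTerms and escapeRegExp are made-up names, and it assumes each term should match at the start of a word.

```javascript
// Hypothetical helper: escape regex metacharacters in a search term.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: true when every search term starts some word
// in the subject, regardless of order.
function matchesAllTerms(subject, search) {
  var terms = search.trim().split(/\s+/).filter(Boolean);
  // One lookahead per term, all anchored at the start of the subject:
  // ^(?=[\s\S]*\bterm1)(?=[\s\S]*\bterm2)...
  var source = "^" + terms.map(function (t) {
    return "(?=[\\s\\S]*\\b" + escapeRegExp(t) + ")";
  }).join("");
  return new RegExp(source, "i").test(subject);
}

matchesAllTerms("NE 9th St", "ne 9th");        // ==> true
matchesAllTerms("9th St NE", "ne 9th");        // ==> true
matchesAllTerms("N University Ave", "ne 9th"); // ==> false
```

Note that an empty search string yields the pattern "^" and therefore matches everything.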


Simple:

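var pattern = "n univ av".replace(/\s+/g, "|");   // the g flag replaces every run of spaces
var rx      = new RegExp(pattern, "gi");
var matches = "N University Ave".match(rx);       // JavaScript has no rx.Matches(); use String.prototype.match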

Or something along these lines.


If these are your search terms:

  1. n univ av
  2. king blv
  3. ne 9th

It sounds like your algorithm should be something like this:

  1. Split the search string on whitespace, which gives you an array of search terms: search.split(/\s+/)
  2. Attempt to match each term within your input: /term/i
  3. For each input that matched, replace each term with the term wrapped in <b> tags: input.replace(/(term)/gi, "<b>$1</b>")

Note: You'll probably want to take precaution to escape regex metacharacters.
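
The steps above, including the escaping, could be sketched roughly like this. The function names are invented for illustration. Two choices here are assumptions rather than part of the steps: each term is anchored with \b so it only matches at the start of a word, and the replacement is done in a single pass over an alternation, so a later term cannot match inside a <b> tag inserted for an earlier one.

```javascript
// Hypothetical helper: escape regex metacharacters in a search term.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper following the three steps above.
function highlightTerms(input, search) {
  // 1. split the search string into terms (escaped for use in a regex)
  var terms = search.trim().split(/\s+/).filter(Boolean).map(escapeRegExp);
  if (terms.length === 0) return input;

  // 2. every term has to match somewhere in the input
  var allFound = terms.every(function (t) {
    return new RegExp("\\b" + t, "i").test(input);
  });
  if (!allFound) return input; // no match: return the input unchanged

  // 3. wrap each matched term in <b> tags, in one pass
  var re = new RegExp("\\b(" + terms.join("|") + ")", "gi");
  return input.replace(re, "<b>$1</b>");
}

highlightTerms("Martin Luther King Jr. Blvd", "jr. blv");
// ==> "Martin Luther King <b>Jr.</b> <b>Blv</b>d"
highlightTerms("N University Ave", "king");
// ==> "N University Ave"
```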
