Regex to search html return, but not actual html jQuery

2023-02-17 19:57

I'm making a highlighting plugin for a client to find things in a page, and I decided to test it with a help viewer I'm still building, but I'm having an issue that'll (probably) require some regex.

I do not want to parse HTML, and I'm totally open to doing this differently; this just seems like the best/right way.

http://oscargodson.com/labs/help-viewer

http://oscargodson.com/labs/help-viewer/js/jquery.jhighlight.js

Type something in the search box. OK, now refresh the page and type something like class, class=" or <a: you'll notice it searches the actual HTML (as expected). How can I search only the text?

If I use .text(), it strips all the HTML and what I get back is just a big blob of text, but I still want the HTML so I don't lose formatting, links, images, etc. I want this to work like CMD/CTRL+F.

You'd use this plugin like:

$('article').jhighlight({find:'class'});

To remove them:

.jhighlight('remove')

==UPDATE==

While Mike Samuel's idea below does in fact work, it's a tad heavy for this plugin. It's mainly for a client looking to erase bad words and/or MS Word characters during a "publishing" process of a form. I'm looking for a more lightweight fix, any ideas?

You really don't want to use eval, mess with innerHTML or parse the markup "manually". The best way, in my opinion, is to deal with text nodes directly and keep a cache of the original html to erase the highlights. Quick rewrite, with comments:

(function($){
  $.fn.jhighlight = function(opt) {

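    // 'remove' is a command, not an options object; copy the defaults
    // instead of mutating them on every call
    var isRemove = opt === 'remove'
      , options = $.extend({}, $.fn.jhighlight.defaults, isRemove ? {} : opt)
      , txtProp = 'textContent' in document.documentElement ? 'textContent' : 'innerText';

    // nothing to search for
    if (!isRemove && $.trim(options.find).length < 1) return this;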

    return this.each(function(){

      var self = $(this);

      // use a cache to clear the highlights
      if (!self.data('htmlCache'))
        self.data('htmlCache', self.html());

      if(opt === 'remove'){
        return self.html( self.data('htmlCache') );
      }

     // create Tree Walker
     // https://developer.mozilla.org/en/DOM/treeWalker
     var walker = document.createTreeWalker(
          this, // walk only on target element
          NodeFilter.SHOW_TEXT,
          null,
          false
      );

      var node
        , matches
        , flags = 'g' + (!options.caseSensitive ? 'i' : '')
        , exp = new RegExp('('+options.find+')', flags) // capturing
        , expSplit = new RegExp(options.find, flags) // no capturing
        , highlights = [];

      // walk this wayy
      // and save matched nodes for later
      while(node = walker.nextNode()){
        if (matches = node.nodeValue.match(exp)){
          highlights.push([node, matches]);
        }
      }

      // must replace stuff after the walker is finished
      // otherwise replacing a node will halt the walker
      for(var nn=0,hln=highlights.length; nn<hln; nn++){

        var node = highlights[nn][0]
          , matches = highlights[nn][1]
          , parts = node.nodeValue.split(expSplit) // split on matches
          , frag = document.createDocumentFragment(); // temporary holder

        // add text + highlighted parts in between
        // like a .join() but with elements :)
        for(var i=0,ln=parts.length; i<ln; i++){

          // non-highlighted text
          if (parts[i].length)
            frag.appendChild(document.createTextNode(parts[i]));

          // highlighted text
          // skip last iteration
          if (i < ln-1){
            var h = document.createElement(options.wrappingTag); // 'span' by default
            h.className = options.className;
            h[txtProp] = matches[i];
            frag.appendChild(h);
          }
        }
        // replace the original text node
        node.parentNode.replaceChild(frag, node);
      };

    });
  };

 $.fn.jhighlight.defaults = {
    find:'',
    className:'jhighlight',
    color:'#FFF77B',
    caseSensitive:false,
    wrappingTag:'span'
 };

})(jQuery);

If you're doing any manipulation on the page, you might want to replace the caching with another clean-up mechanism, not trivial though.

You can see the code working here: http://jsbin.com/anace5/2/

You also need to add display:block to your new HTML5 elements (such as article); the layout is broken in a few browsers.
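One thing worth checking in the rewrite above: options.find goes straight into new RegExp, so a search such as ( or class=" is treated as a pattern and not as literal text. Below is a minimal sketch of escaping the input first, and of how split with a capturing group keeps the matched text between the unmatched parts. The names escapeRegExp and splitMatches are hypothetical helpers, not part of the plugin.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: split text on literal occurrences of `find`.
// Because the pattern is wrapped in a capturing group, split() keeps the
// matched text at the odd indices and the surrounding text at the even ones.
function splitMatches(text, find, caseSensitive) {
  var flags = 'g' + (caseSensitive ? '' : 'i');
  var exp = new RegExp('(' + escapeRegExp(find) + ')', flags);
  return text.split(exp);
}

// The odd-indexed entries are the ones a plugin would wrap in a highlight element.
console.log(splitMatches('Class and class', 'class', false));
console.log(splitMatches('f(x)', '(', false));
```

With this shape a single split call yields both the plain and the highlighted pieces, so the separate match call would not be needed; whether that fits the plugin is a design choice.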

In the javascript code prettifier, I had this problem. I wanted to search the text but preserve tags.

What I did was start with HTML, and decompose that into two bits.

The text content
Pairs of (index into text content where a tag occurs, the tag content)

So given

Lorem <b>ipsum</b>

I end up with

text = 'Lorem ipsum'
tags = [6, '<b>', 11, '</b>']

which allows me to search on the text, and then based on the result start and end indices, produce HTML including only the tags (and only balanced tags) in that range.
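The decomposition described above can be sketched roughly as follows. This is not the prettifier's actual code: decompose and recompose are hypothetical names, the tag pattern is deliberately naive (it does not handle comments, scripts, or a > inside an attribute value), and entities are left undecoded. Each tag is stored with the offset into the text at which it occurs, so the closing tag in the example lands at 11, the length of 'Lorem ipsum'.

```javascript
// Hypothetical sketch: split HTML into its text and a flat list of
// (offset into text, tag source) pairs.
function decompose(html) {
  var tagPattern = /<[^>]*>/g; // naive: assumes no '>' inside attribute values
  var text = '';
  var tags = [];
  var last = 0;
  var m;
  while ((m = tagPattern.exec(html)) !== null) {
    text += html.slice(last, m.index); // text between the previous tag and this one
    tags.push(text.length, m[0]);      // the tag sits at the current end of the text
    last = m.index + m[0].length;
  }
  text += html.slice(last);
  return { text: text, tags: tags };
}

// Hypothetical inverse: weave the tags back into the text at their offsets.
function recompose(parts) {
  var out = '';
  var pos = 0;
  for (var i = 0; i < parts.tags.length; i += 2) {
    out += parts.text.slice(pos, parts.tags[i]) + parts.tags[i + 1];
    pos = parts.tags[i];
  }
  return out + parts.text.slice(pos);
}

var parts = decompose('Lorem <b>ipsum</b>');
console.log(parts.text, parts.tags);
console.log(recompose(parts));
```

A search then runs on parts.text alone, and the match's start and end indices can be compared against the stored offsets to decide where highlight markup may be inserted.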

Have a look here: getElementsByTagName() equivalent for textNodes. You can probably adapt one of the proposed solutions to your needs (i.e. iterate over all text nodes, replacing the words as you go - this won't work in cases such as <tag>wo</tag>rd but it's better than nothing, I guess).

I believe you could just do:

$('#article :not(:has(*))').jhighlight({find : 'class'});

Since it grabs only the leaf elements in the article (those with no child elements), it requires valid XHTML, and it would only match link in the following example:

<p>This is some paragraph content with a <a href="#">link</a></p>

DOM traversal / selector application could slow things down a bit so it might be good to do:

var article_nodes = article_nodes || $('#article :not(:has(*))'); // cache the selection
article_nodes.jhighlight({find : 'class'});

Maybe something like this could help:

>+[^<]*?(s(<[\s\S]*?>)?e(<[\s\S]*?>)?e)[^>]*?<+

The first part, >+[^<]*?, finds the > of the last preceding tag.

The third part, [^>]*?<+, finds the < of the first subsequent tag.

In the middle we have (<[\s\S]*?>)? between the characters of our search phrase (in this case, "see"), so the phrase still matches when tags split it up.

After the regular expression search, you can use the match from the middle part to highlight the search phrase for the user.
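Such a pattern could be generated from any search phrase along these lines. This is only a sketch: buildTagTolerantPattern and escapeRegExp are hypothetical names, and the in-between piece uses (?:<[^>]*>)? where the answer has (<[\s\S]*?>)?, so that an optional tag cannot stretch across text sitting between two tags. Like the original pattern, it only finds text that has a tag somewhere before and after it.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a pattern that matches `phrase` in text content,
// allowing one tag between any two characters of the phrase.
function buildTagTolerantPattern(phrase) {
  var chars = phrase.split('').map(escapeRegExp);
  var middle = chars.join('(?:<[^>]*>)?'); // optional tag between characters
  // '>' of the preceding tag, text, the phrase (group 1), text, '<' of the next tag
  return new RegExp('>[^<]*?(' + middle + ')[^>]*?<', 'i');
}

var pattern = buildTagTolerantPattern('see');
var hit = pattern.exec('<p>you s<b>e</b>e it</p>');
console.log(hit && hit[1]);                          // the phrase, including inner tags
console.log(pattern.exec('<a class="see">x</a>'));   // attribute values are not searched
```

Group 1 holds the matched phrase together with any tags inside it, which is the part to wrap or replace when highlighting.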
