Convert HTML Character Entities back to regular text using javascript

2023-01-28 09:06 Q&A author:

The question says it all :)

E.g. we have &gt; and we need >, using only JavaScript.

Update: It seems jQuery is the easy way out. But it would be nice to have a lightweight solution, more like a function that is capable of doing this by itself.

You could do something like this:

String.prototype.decodeHTML = function() {
    var map = {"gt":">" /* , … */};
    return this.replace(/&(#(?:x[0-9a-f]+|\d+)|[a-z]+);?/gi, function($0, $1) {
        if ($1[0] === "#") {
            return String.fromCharCode($1[1].toLowerCase() === "x" ? parseInt($1.substr(2), 16)  : parseInt($1.substr(1), 10));
        } else {
            return map.hasOwnProperty($1) ? map[$1] : $0;
        }
    });
};
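As a rough sketch, the same regex idea can also be written as a standalone function that does not extend String.prototype. The names decodeHtmlEntities and NAMED_ENTITIES are made up for this illustration, the entity table is deliberately tiny rather than the full HTML list, and String.fromCodePoint assumes an ES2015-capable engine (it is used here so that code points above 0xFFFF decode to a proper surrogate pair).

```javascript
// Illustrative, incomplete table of named entities (an assumption, not the full HTML list).
var NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: "\"", apos: "'", nbsp: "\u00A0" };

function decodeHtmlEntities(text) {
    return text.replace(/&(#(?:x[0-9a-f]+|\d+)|[a-z]+);/gi, function (match, body) {
        if (body.charAt(0) === "#") {
            // Numeric reference: hexadecimal (&#x..;) or decimal (&#..;)
            var isHex = body.charAt(1).toLowerCase() === "x";
            var codePoint = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
            // Leave out-of-range references untouched rather than throwing a RangeError
            return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
        }
        // Named reference: decode only names present in the table, keep the rest as-is
        return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
            ? NAMED_ENTITIES[body]
            : match;
    });
}

console.log(decodeHtmlEntities("1 &lt; 2 &amp;&amp; 3 &gt; 2")); // 1 < 2 && 3 > 2
console.log(decodeHtmlEntities("&#65;&#x42;&unknown;"));          // AB&unknown;
```

Unknown names are passed through unchanged, so extending the table is the only step needed to support more entities.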

function decodeEntities(s){
    var str, temp= document.createElement('p');
    temp.innerHTML= s;
    str= temp.textContent || temp.innerText;
    temp=null;
    return str;
}

alert(decodeEntities('&lt;'))

/*  returned value: (String)
<
*/

I know there are libraries out there, but here are a couple of solutions for browsers. These work well when placing HTML entity data strings into human-editable areas where you want the characters to be shown, such as textareas or input[type=text].

I add this answer as I have to support older versions of IE and I feel that it wraps up a few days' worth of research and testing. I hope somebody finds this useful.

First, this one is for more modern browsers using jQuery. Please note that this should NOT be used if you have to support versions of IE before 10 (7, 8, or 9), as it will strip out the newlines, leaving you with just one long line of text.

if (!String.prototype.HTMLDecode) {
    String.prototype.HTMLDecode = function () {
        var str = this.toString(),
            $decoderEl = $('<textarea />');

        str = $decoderEl.html(str)
            .text()
            .replace(/<br((\/)|( \/))?>/gi, "\r\n");

        $decoderEl.remove();

        return str;
    };
}
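To see which tag spellings the <br> pattern in the snippet above actually matches, it can be tried on its own without jQuery or a browser. This is only a small sketch; the helper name brToNewline is made up for the illustration.

```javascript
// The same pattern used in the snippet above
var brPattern = /<br((\/)|( \/))?>/gi;

// Hypothetical helper: turn matching <br> tags into CRLF newlines
function brToNewline(text) {
    return text.replace(brPattern, "\r\n");
}

// <br>, <br/> and <br /> match, in any letter case
console.log(JSON.stringify(brToNewline("a<br>b<br/>c<BR />d"))); // "a\r\nb\r\nc\r\nd"
// Other spellings, such as two spaces before the slash, are left alone
console.log(JSON.stringify(brToNewline("a<br  />b")));           // "a<br  />b"
```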

This next one is based on kennebec's work above, with some differences which are mostly for the sake of older IE versions. This does not require jQuery, but does still require a browser.

if (!String.prototype.HTMLDecode) {
    String.prototype.HTMLDecode = function () {
        var str = this.toString(),
            //Create an element for decoding            
            decoderEl = document.createElement('p');

        //Bail if empty, otherwise IE7 will return undefined when 
        //OR-ing the 2 empty strings from innerText and textContent
        if (str.length == 0) {
            return str;
        }

        //convert newlines to <br's> to save them
        str = str.replace(/((\r\n)|(\r)|(\n))/gi, " <br/>");            

        decoderEl.innerHTML = str;
        /*
        We use innerText first as IE strips newlines out with textContent.
        There is said to be a performance hit for this, but sometimes
        correctness of data (keeping newlines) must take precedence.
        */
        str = decoderEl.innerText || decoderEl.textContent;

        //clean up the decoding element
        decoderEl = null;

        //replace back in the newlines
        return str.replace(/<br((\/)|( \/))?>/gi, "\r\n");
    };
}

/* 
Usage: 
    var str = "&gt;";
    return str.HTMLDecode();

returned value: 
    (String) >    
*/

Here is a "class" for decoding a whole HTML document.

var HTMLDecoder = {
    tempElement: document.createElement('span'),
    decode: function(html) {
        var _self = this;
        // Return the result; String.prototype.replace does not modify html in place
        return html.replace(/&(#(?:x[0-9a-f]+|\d+)|[a-z]+);/gi,
            function(str) {
                // Let the browser decode one entity at a time
                _self.tempElement.innerHTML = str;
                return _self.tempElement.textContent || _self.tempElement.innerText;
            }
        );
    }
};

Note that I used Gumbo's regexp for catching entities, but for fully valid HTML documents (or XHTML) you could simply use /&[^;]+;/g.
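A quick sketch of why the simpler pattern is only safe on valid markup: on text that contains a bare ampersand, it can swallow ordinary words up to the next semicolon, while the stricter pattern skips them. The variable names and the sample string are made up for the illustration.

```javascript
var strictPattern = /&(#(?:x[0-9a-f]+|\d+)|[a-z]+);/gi; // stricter entity pattern from the code above
var loosePattern = /&[^;]+;/g;                           // simpler alternative from the note

// Sample with a bare "&", which is not valid HTML
var sample = "Tom & Jerry; &lt;b&gt;";

console.log(sample.match(strictPattern)); // ["&lt;", "&gt;"]
console.log(sample.match(loosePattern));  // ["& Jerry;", "&lt;", "&gt;"]
```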

There is nothing built in, but there are many libraries that have been written to do this.

Here is one.

And here is one that is a jQuery plugin.
