Django syntax highlighting causing character escaping issues

Posted 2022-12-09 08:31

I've been working on my own Django-based blog (like everyone, I know) to sharpen up my Python, and I thought adding some syntax highlighting would be pretty great. I looked at some of the snippets out there and decided to combine a few and write my own syntax highlighting template filter using Beautiful Soup and Pygments. It looks like this:

from django import template
from BeautifulSoup import BeautifulSoup
import pygments
import pygments.lexers as lexers
import pygments.formatters as formatters

register = template.Library()

@register.filter(name='pygmentize')
def pygmentize(value):
    try:
        formatter = formatters.HtmlFormatter(style='trac')
        tree = BeautifulSoup(value)
        for code in tree.findAll('code'):
            if not code['class']: code['class'] = 'text'
            lexer = lexers.get_lexer_by_name(code['class'])
            new_content = pygments.highlight(code.contents[0], lexer, formatter)
            new_content += u"<style>%s</style>" % formatter.get_style_defs('.highlight')
            code.replaceWith("%s\n" % new_content)
        content = str(tree)
        return content
    except KeyError:
        return value

It looks for a code block like this, highlights it, and adds the relevant styles:

<code class="python">
    print "Hello World"
</code>

This was all working fine until a block of code I included had some HTML in it. Now, I know all the HTML I need, so I write my blog posts directly in it and, when rendering to the template, just mark the post body as safe:

{{ post.body|pygmentize|safe }}

This approach results in any HTML in a code block just rendering as HTML (i.e., not showing up). I've been playing around with using the Django escape function on the code extracted from the body by my filter, but I can never quite seem to get it right. I think my understanding of content escaping just isn't complete enough. I've also tried writing the escaped version in the post body (e.g. &lt;), but it just comes out as text.
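For reference, here is a minimal sketch of what escaping does to a snippet. It uses Python's standard-library html.escape as a stand-in for the Django escape function mentioned above (that substitution is an assumption made to keep the example self-contained; the filter itself calls neither), and the sample string is made up for illustration:

```python
from html import escape

# Hypothetical snippet that a blog post might want to display literally.
snippet = '<b>bold</b> & "quoted"'

# escape() replaces &, <, > (and quotes, by default) with HTML entities,
# so a browser shows the markup as text instead of interpreting it.
escaped = escape(snippet)

print(escaped)
```

Output that has been escaped this way can be marked safe without the tags being rendered, because the browser only sees entities.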

What is the best way to mark the html for display? Am I going about this all wrong?

Thanks.

I've finally found some time to figure it out. When Beautiful Soup parses a code block whose content contains a tag, the block's contents become a list of child nodes, and the tag is just one entry in that list. This line is the culprit:

new_content = pygments.highlight(code.contents[0], lexer, formatter)

The [0] takes only the first child node and cuts off the rest of the code; nothing was being decoded incorrectly. Poor bug spotting on my part. That line needs to be replaced with:

new_content = pygments.highlight(code.decodeContents(), lexer, formatter)
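The difference can be reproduced in isolation. The sketch below assumes the newer bs4 package (BeautifulSoup 4), where the equivalent method is spelled decode_contents() rather than the BeautifulSoup 3 decodeContents() used above; the sample markup is made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical post body: a code block whose content includes an HTML tag.
html = '<code class="html">Intro <b>bold</b> tail</code>'
soup = BeautifulSoup(html, "html.parser")
code = soup.find("code")

# contents is a list of child nodes: text, the <b> tag, then more text.
first_child = code.contents[0]

# decode_contents() serializes every child back into one markup string.
whole_body = code.decode_contents()

print(len(code.contents))
print(repr(str(first_child)))
print(repr(whole_body))
```

Passing only the first child to the highlighter drops everything from the first tag onward, which matches the symptom of HTML in a code block not showing up.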

The lessons here: make sure you know what the problem actually is, and know how your libraries work.
