
How can I store HTML in a DB (SQLITE PYTHON)

This is probably quite simple, but I can't get there...

How can I store HTML code in a SQLite database?

I'm using text as the data type for the field in the DB (should it be blob?)

I'm getting weird errors, and the errors change with the same input, so I think it has something to do with escaping.

MY CODE:

con = sqlite.connect(bd)
cur = con.cursor()
temp=cur.execute ('SELECT * from posts').fetchall()
#temp[Z][1] = ID
#temp[Z][4] = URL
i=0
while i< len (temp):
    if temp[i][0]==None:
        try:
            html = urllib2.urlopen(str(temp[i][4])).read()
        except:
            html=None
        #sql = 'UPDATE posts SET html = "' + str(html) + '" WHERE  id = ' +  str(temp[i][1])
        #cur.execute( 'UPDATE posts SET html = ? WHERE  id = ?' ,(html,temp[i][1]) )
        cur.execute("UPDATE posts SET html = '" + str(html) + "' WHERE  id = " +  str(temp[i][1]))
        con.commit()
        print temp[i][4]
    i=i+1

The errors:

1 -

OperationalError: near "2": syntax error
WARNING: Failure executing file:
Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) Type "copyright", "credits" or "license" for more information.

2-

ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

P.S. I would rather store it as text (human readable) than as a blob, but if blob is the easier way, I'm all for it.

Thanx


Try:

cur.execute(
    "UPDATE posts SET html = ? WHERE id = ?", (html, temp[i][1]))

Use parameterized arguments to allow sqlite3 to escape the quotes for you. (It also helps prevent SQL injection.)
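
As a minimal sketch of why this helps, the snippet below stores HTML containing both single and double quotes through a parameterized UPDATE. The in-memory database, the simplified posts table (id, url, html) and the sample row are assumptions made up for illustration, not the asker's real schema.

```python
import sqlite3

# Assumption: an in-memory database with a simplified "posts" table.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, url TEXT, html TEXT)")
cur.execute("INSERT INTO posts (id, url) VALUES (?, ?)", (1, "http://example.com/"))

# HTML with both quote styles, which breaks naive string concatenation.
html = u"""<p class="note">It's 2 o'clock</p>"""

# The driver binds the values, so no manual escaping is needed.
cur.execute("UPDATE posts SET html = ? WHERE id = ?", (html, 1))
con.commit()

stored = cur.execute("SELECT html FROM posts WHERE id = ?", (1,)).fetchone()[0]
```

Here stored comes back identical to html, quotes included. Building the same statement by concatenation would end the SQL string literal at the first single quote, which is the likely source of the OperationalError.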

Regarding the ProgrammingError: html should be a unicode object rather than a byte string (str in Python 2), and urlopen(...).read() returns a byte string. When you open the url:

response=urllib2.urlopen(str(temp[i][4]))

Look at the content type header:

content_type=response.headers.getheader('Content-Type')
print(content_type)

It might say something like

'text/html; charset=utf-8'

in which case you should decode the html string with the utf-8 codec:

html = response.read().decode('utf-8')

This will make html a unicode object, and (hopefully) address the ProgrammingError.
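
The charset will not always be utf-8, and some servers omit it. A rough sketch of picking the codec from the header is shown here; charset_from_content_type and decode_html are hypothetical helper names, and falling back to utf-8 with replacement characters is an assumption rather than a rule.

```python
def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or default."""
    if content_type:
        # Parameters follow the media type, separated by ';'.
        for part in content_type.split(";")[1:]:
            name, _, value = part.strip().partition("=")
            if name.strip().lower() == "charset" and value:
                return value.strip().strip("\"'")
    return default


def decode_html(raw, content_type, default="utf-8"):
    """Decode raw bytes to unicode using the header's charset (hypothetical helper)."""
    charset = charset_from_content_type(content_type, default)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: fall back and replace undecodable bytes.
        return raw.decode(default, "replace")
```

In the loop above this could be used as html = decode_html(response.read(), content_type), after which html can be passed straight to the parameterized UPDATE.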
