How can I store HTML in a DB (SQLITE PYTHON)
This is probably quite simple, but I can't get there...
How can I store HTML code in a SQLite database?
I'm using text as the data type for the field in the DB (should it be blob?)
I'm getting weird errors, and the errors change with the same input, so I think it has something to do with escaping.
MY CODE:
con = sqlite.connect(bd)
cur = con.cursor()
temp=cur.execute ('SELECT * from posts').fetchall()
#temp[Z][1] = ID
#temp[Z][4] = URL
i = 0
while i < len(temp):
    if temp[i][0] == None:
        try:
            html = urllib2.urlopen(str(temp[i][4])).read()
        except:
            html = None
        #sql = 'UPDATE posts SET html = "' + str(html) + '" WHERE id = ' + str(temp[i][1])
        #cur.execute( 'UPDATE posts SET html = ? WHERE id = ?' ,(html,temp[i][1]) )
        cur.execute("UPDATE posts SET html = '" + str(html) + "' WHERE id = " + str(temp[i][1]))
        con.commit()
        print temp[i][4]
    i = i + 1
The errors:
1 -
OperationalError: near "2": syntax error WARNING: Failure executing file: Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) Type "copyright", "credits" or "license" for more information.
2-
ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
P.S. I would rather store it as text (human readable) than as a blob, but if blob is the easier way, I'm all for it.
Thanks
Try:
cur.execute(
    "UPDATE posts SET html = ? WHERE id = ?", (html, temp[i][1]))
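Use parameterized arguments so that sqlite3 handles the quoting for you: the HTML is passed as a value and is never spliced into the SQL text, so quotes inside it cannot cause a syntax error. (It also helps prevent SQL injection.)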
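As a minimal sketch of the parameterized update, the snippet below uses an in-memory database and a made-up posts table (the column layout and the sample HTML are assumptions for illustration, not the schema from the question). It should run on Python 3 as well as Python 2.6+:

```python
import sqlite3

# Hypothetical stand-in for the real database: an in-memory `posts` table.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, url TEXT, html TEXT)")
cur.execute("INSERT INTO posts (id, url) VALUES (?, ?)", (1, "http://example.com/"))

# Sample HTML containing both quote styles, which would break SQL built
# by string concatenation.
html = u'<p class="a">It\'s a "test"</p>'

# The ? placeholders let sqlite3 bind the values; no manual escaping needed.
cur.execute("UPDATE posts SET html = ? WHERE id = ?", (html, 1))
con.commit()

# Read the row back to check that the HTML survived unchanged.
stored = cur.execute("SELECT html FROM posts WHERE id = ?", (1,)).fetchone()[0]
print(stored == html)
```

A TEXT column is enough here; the stored value stays human readable.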
Regarding the ProgrammingError: html should be a unicode object rather than a str (byte string) object. When you open the URL:
response=urllib2.urlopen(str(temp[i][4]))
Look at the content type header:
content_type=response.headers.getheader('Content-Type')
print(content_type)
It might say something like
'text/html; charset=utf-8'
in which case you should decode the html string with the utf-8 codec:
html = response.read().decode('utf-8')
This will make html a unicode object, and (hopefully) address the ProgrammingError.
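Since the charset is not always utf-8, one option is to read it from the Content-Type header before decoding. The helper names below (charset_from_content_type, decode_body) are hypothetical, and the utf-8 fallback is an assumption that may not suit every page; treat this as a sketch rather than a complete charset-detection routine:

```python
import re

def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header, or `default`."""
    match = re.search(r"charset=([^\s;]+)", content_type or "", re.I)
    return match.group(1).strip("\"'") if match else default

def decode_body(raw_bytes, content_type):
    """Decode a fetched page body into a unicode string."""
    charset = charset_from_content_type(content_type)
    try:
        return raw_bytes.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec or undecodable bytes: fall back to utf-8 (an
        # assumption) and replace anything that cannot be decoded.
        return raw_bytes.decode("utf-8", "replace")

# Sample bytes standing in for response.read().
raw = u"<p>caf\u00e9</p>".encode("utf-8")
html = decode_body(raw, "text/html; charset=utf-8")
```

With the code from the question, raw would be response.read() and the header value would be the content_type string printed above.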