
Raising events and object persistence in Django

I have a tricky Django problem that didn't occur to me while I was developing the application. My Django application lets a user sign up and store his login credentials for another site. The application lets the user search this other site (by scraping content off it) and returns the results to the user. Each user query triggers a couple of requests to the other site. This seemed to work fine, but sometimes the other site slaps me with a CAPTCHA. I've written the code to get the CAPTCHA image, and I need to return it to the user so he can type it in, but I don't know how.

My search request (the query, the username and the password) gets passed to a view, which in turn calls the backend that does the scraping/search. When a CAPTCHA is detected, I'd like to raise a client-side event or something along those lines, display the CAPTCHA to the user, and wait for the user's input so that I can resume my search. I would somehow need to persist my backend object between calls. I've tried pickling it, but that fails with the "Can't pickle 'lock' object" error. I don't know how to implement this, though. Any help/ideas?

Thanks a ton.


Something else to remember: you need to maintain a browser session with the remote site so that the site knows which CAPTCHA you're trying to solve. Many web clients let you store your cookies, and I'd suggest you dump them in the Django session of the user you're doing the screen scraping for. Then load them back up when you submit the CAPTCHA.
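
As a rough illustration of dumping cookies into the session and loading them back, here is a minimal sketch using only the standard library's http.cookiejar. The key name "remote_cookies" and the helpers make_cookie, save_cookies and load_cookies are made up for this example, and it assumes the session behaves like a dict (as Django's request.session does). Only name, value, domain and path are kept, so attributes such as expiry are lost.

```python
from http.cookiejar import Cookie, CookieJar

SESSION_KEY = "remote_cookies"  # hypothetical key name, pick your own


def make_cookie(name, value, domain, path="/"):
    """Build a minimal non-persistent cookie for the remote site."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_cookies(session, jar):
    """Store the jar as plain dicts so the session can serialize it."""
    session[SESSION_KEY] = [
        {"name": c.name, "value": c.value, "domain": c.domain, "path": c.path}
        for c in jar
    ]


def load_cookies(session):
    """Rebuild a CookieJar from whatever save_cookies stored (if anything)."""
    jar = CookieJar()
    for item in session.get(SESSION_KEY, []):
        jar.set_cookie(make_cookie(**item))
    return jar
```

Storing plain strings rather than the jar itself sidesteps the pickling problem from the question, since nothing with a lock ends up in the session.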

Here's how I see the full turn of events:

  1. User places search request
  2. Query remote site
  3. If not CAPTCHA, GOTO #10
  4. Save remote cookies in local session
  5. Download the CAPTCHA image (perhaps to session too?)
  6. Present the CAPTCHA and a form to your user
  7. User submits CAPTCHA
  8. You load up cookies from #4 and submit the form as a POST
  9. GOTO #3
  10. Process the data off the page, present to user, high-five yourself.
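
The steps above could be sketched roughly as follows. This is only an outline under assumptions: fetch is a hypothetical callable standing in for your scraping backend, and the shape of the dict it returns ("is_captcha", "cookies", "captcha_image", "data") is invented for the example. The point is that only the cookies and the pending query are kept between requests, not the backend object itself.

```python
def run_search(session, query, fetch, captcha_answer=None):
    """One pass through the flow: returns ("captcha", image) or ("results", data)."""
    # Step 8: reuse any cookies saved on an earlier pass.
    cookies = session.get("remote_cookies", {})
    # Steps 2 and 8: query the remote site, sending the CAPTCHA answer if we have one.
    response = fetch(query, cookies, captcha_answer)

    if response["is_captcha"]:
        # Step 4: remember the remote session so the answer matches this CAPTCHA.
        session["remote_cookies"] = response["cookies"]
        session["pending_query"] = query
        # Steps 5-6: hand the image back so the view can show it with a form.
        return "captcha", response["captcha_image"]

    # Step 10: no CAPTCHA, so clean up and return the scraped data.
    session.pop("pending_query", None)
    return "results", response["data"]


def fake_fetch(query, cookies, captcha_answer):
    """Stand-in backend for trying the flow: demands a CAPTCHA until 'xk7p' is given."""
    if captcha_answer != "xk7p":
        return {"is_captcha": True, "cookies": {"sid": "abc123"},
                "captcha_image": b"<image bytes>"}
    return {"is_captcha": False, "cookies": cookies,
            "data": ["result for " + query]}
```

In a Django view you would call run_search(request.session, ...) once for the initial search and again when the CAPTCHA form is posted, passing session["pending_query"] and the user's answer. A wrong answer simply yields another "captcha" result, which is the GOTO #3 loop.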


request.session['name'] = variable will store it; then

variable = request.session['name'] will retrieve it. Remember, though, it's not a database, just a simple session store, and it shouldn't be relied on for anything critical.
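
One caveat with the bracket form: like a dict, request.session['name'] raises KeyError if the key was never set. A small sketch of a more forgiving pattern follows. The helper names and the "pending_query" key are made up for illustration, and a SimpleNamespace with a plain dict stands in for a real Django request so the snippet runs on its own.

```python
from types import SimpleNamespace


def remember_query(request, query):
    """Stash the query so it survives until the CAPTCHA form comes back."""
    request.session["pending_query"] = query


def take_query(request):
    """Read and clear the stashed query; returns None if nothing was stored."""
    return request.session.pop("pending_query", None)


# Stand-in for a Django request; a real one provides request.session itself.
request = SimpleNamespace(session={})
remember_query(request, "django captcha")
```

Using pop with a default both avoids the KeyError and clears the value once it has been consumed, so a stale query is not replayed on a later request.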
