How do you set up multiple multithreaded QWebViews in PyQt?
I am trying to make an application in Python using PyQt that can fetch the generated content of a list of URLs and process the fetched source with the help of multiple threads. I need to run about ten QWebViews at once. As ridiculous as that might sound, when it comes to hundreds of URLs, using threaded QWebViews gets the results over 3 times faster than normal.
Here is the test code that I have been having problems with...
import sys
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *

class Worker(QThread):
    def __init__(self, url, frame):
        QThread.__init__(self)
        self.url = url
        self.frame = frame

    def run(self):
        self.frame.load(QUrl(self.url))
        print len(self.frame.page().mainFrame().toHtml())

app = QApplication(sys.argv)
webFrame = QWebView()
workerList = []
for x in range(1):
    worker = Worker('http://www.google.com', webFrame)
    workerList.append(worker)
for worker in workerList:
    worker.start()
sys.exit(app.exec_())
Above, I tried initializing the QWebView in the main QApplication only to get:
QObject: Cannot create children for a parent that is in a different thread.
So then I tried initializing the QWebView in the QThread; but then, the QWebView remained unchanged and blank without outputting any errors or anything. This was probably due to a cache error.
I have the feeling that I am missing something or skipping a very important step. Since threaded QWebViews in PyQt aren't a well-documented topic, I would really appreciate any help on how to implement this successfully.
There are multiple issues with your question and code:
- You are talking about QWebFrame, but are actually passing a QWebView to your worker(s). Since this is a QWidget, it belongs to the main (GUI) thread and should not be modified by other threads.
- One QWebView / QWebFrame can only load one URL at a time, so you cannot share it across multiple workers.
- QWebFrame.load() loads data asynchronously, i.e. a call to load() returns immediately and there will be no data to read yet. You will have to wait for the loadFinished() signal to be emitted before you can access the data.
- Since the actual loading is done by the networking layer of the operating system, and the load() method does not block, there is no need to run it in a separate thread in the first place. Why do you claim that this should be faster? It makes no sense to me.
- Since you want to load hundreds of URLs in parallel (or about 10; you mention both), are you sure that you want to use QWebFrame, which is a presentation class? Do you actually want to render the HTML, or are you just interested in the data retrieved?
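To make the points above concrete, here is a rough sketch of one way to do this without any QThread: keep a small pool of pages in the GUI thread, start one load per page, and hand each page the next URL when its loadFinished() signal fires. The names PagePool, Page, report and main are made up for this illustration and are not part of PyQt. The sketch assumes PyQt4 with QtWebKit and uses QWebPage rather than QWebView on the assumption that nothing needs to be displayed. Treat it as a starting point, not a tested crawler.

```python
import sys
from collections import deque
from functools import partial


class PagePool(object):
    """Feed a queue of URLs to a small, fixed set of page objects.

    Each page object is assumed to provide load(url), html() and a
    loadFinished signal with connect(); these names are hypothetical.
    """

    def __init__(self, urls, make_page, on_result, on_done, size=10):
        self.pending = deque(urls)
        self.on_result = on_result   # called as on_result(url, html_or_None)
        self.on_done = on_done       # called once, after the last result
        self.current = {}            # slot index -> URL being loaded
        self.finished = False
        count = min(size, len(self.pending))
        self.pages = [make_page() for _ in range(count)]
        for slot, page in enumerate(self.pages):
            page.loadFinished.connect(partial(self._finished, slot))

    def start(self):
        if not self.pages:           # nothing to load at all
            self._finish()
        for slot in range(len(self.pages)):
            self._next(slot)

    def _next(self, slot):
        if self.pending:
            url = self.pending.popleft()
            self.current[slot] = url
            self.pages[slot].load(url)   # with Qt this returns immediately
        elif not self.current:
            self._finish()

    def _finished(self, slot, ok):
        url = self.current.pop(slot, None)
        if url is None:              # stray signal, nothing was requested
            return
        self.on_result(url, self.pages[slot].html() if ok else None)
        self._next(slot)

    def _finish(self):
        if not self.finished:
            self.finished = True
            self.on_done()


def main(urls):
    # PyQt4 is imported here so that PagePool itself does not depend on Qt.
    from PyQt4.QtCore import QTimer, QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebPage

    class Page(QWebPage):
        # Thin adapter giving QWebPage the load()/html() names used above.
        def load(self, url):
            self.mainFrame().load(QUrl(url))

        def html(self):
            return self.mainFrame().toHtml()

    def report(url, html):
        size = -1 if html is None else len(html)
        sys.stdout.write("%s %d\n" % (url, size))

    app = QApplication(sys.argv)
    pool = PagePool(urls, Page, report, app.quit)
    QTimer.singleShot(0, pool.start)   # start once the event loop is running
    return app.exec_()

# Call main([...]) with a list of URLs to try it (requires PyQt4).
```

Calling main(['http://www.google.com']) should print the URL and the length of the HTML once the page has finished loading, and then quit. All pages live in the main thread, so the "different thread" error from the question cannot occur; the parallelism comes from load() being asynchronous. If the rendered page is not needed at all, a plain QNetworkAccessManager would probably be a lighter choice than QWebPage.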