BackgroundWorker and WebBrowser Control
Is it possible/recommended to use background worker threads with the web browser control?
I am creating a bot that searches Google for keywords, then checks the first 10 result pages to see whether a given site is ranked.
The user can provide a maximum of 20 sites to check, and can use proxies. So ideally I'd like to have 5 threads working at once.
Is it possible? I might have heard somewhere that there are problems with WebBrowser control and threads.
It is not. WebBrowser uses Internet Explorer, which is a COM component. COM components have a threading model, and IE uses "Apartment", which is an expensive word that means it is not thread-safe. You are allowed to call its methods from a BackgroundWorker (BGW), but COM will automatically marshal each call to the UI thread. Since all method calls and property accesses actually happen on the UI thread, using a BGW only makes it slower.
You can in fact run WebBrowser on another thread, but you'll have to create the instance on that thread, and the thread must be a so-called Single Threaded Apartment. STA is an acronym you might well recognize from the [STAThread] attribute on the Main() method of a Winforms or WPF application. Changing a worker thread to STA requires calling Thread.SetApartmentState() before you start it. You cannot do this for a BGW. The thread must also pump a message loop to implement the STA contract, so it must call Application.Run(). That is required, for one, to get WebBrowser to raise its events. This answer shows the approach.
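A minimal sketch of the STA approach described above, assuming a WinForms project that references System.Windows.Forms. The class name StaBrowserSketch, the method StartBrowserThread and the onLoaded callback are hypothetical names chosen for illustration; only the framework calls (Thread.SetApartmentState, Application.Run, WebBrowser.Navigate and so on) come from .NET itself.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class StaBrowserSketch
{
    // Hypothetical helper: loads one URL in a WebBrowser that lives on its own STA thread.
    public static Thread StartBrowserThread(string url, Action<string> onLoaded)
    {
        var thread = new Thread(() =>
        {
            // The control must be created on the thread that will own it.
            using (var browser = new WebBrowser())
            {
                browser.ScriptErrorsSuppressed = true;
                browser.DocumentCompleted += (sender, e) =>
                {
                    // DocumentCompleted can fire once per frame; wait for the whole page.
                    if (browser.ReadyState != WebBrowserReadyState.Complete) return;
                    onLoaded(browser.DocumentText); // runs on this STA thread, not the UI thread
                    Application.ExitThread();       // ends the message loop started below
                };
                browser.Navigate(url);
                Application.Run();                  // pump messages so the events can be raised
            }
        });
        thread.SetApartmentState(ApartmentState.STA); // must be set before Start()
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

Each call creates a separate browser on a separate thread, so five concurrent checks would mean five such threads. The callback runs on the worker thread, so anything it does to the main form still has to be marshaled back to the UI thread.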
Consider using the WebRequest class.
Is there any reason you're using the IE control over a library such as HTML Agility Pack? That supports multithreading without the COM nightmare of IE, and is a lot more powerful for HTML parsing.
To answer your immediate question: I've never tried it, but it wouldn't surprise me if there were problems. WinForms controls in general are not intended to be accessed from threads other than the main UI thread. You should use the Control.Invoke() method to call into a control from other threads. This queues the calls up on the main UI thread.
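As a rough illustration of the Control.Invoke() pattern mentioned above, here is a small helper. The names UiThread and Run are made up for this sketch; Control.InvokeRequired and Control.Invoke are the actual WinForms members.

```csharp
using System;
using System.Windows.Forms;

static class UiThread
{
    // Hypothetical helper: runs the action on the thread that owns the control.
    public static void Run(Control control, Action action)
    {
        if (control.InvokeRequired)
        {
            // Called from a worker thread: marshal to the UI thread and wait for completion.
            control.Invoke(action);
        }
        else
        {
            // Already on the owning thread, so call directly.
            action();
        }
    }
}
```

From a worker thread you would call something like UiThread.Run(statusLabel, () => statusLabel.Text = "Done"); where statusLabel stands in for whatever control you need to update.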
To address the broader problem: you're probably better off not using a WebBrowser control at all if you don't need to actually render HTML for the user to see. You can download a page using the HttpWebRequest class, which is much lighter. WebBrowser is basically full-blown Internet Explorer embedded in your application.
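A minimal sketch of downloading a page with HttpWebRequest, optionally through a proxy since the question mentions them. PageDownloader, CreateRequest and Download are hypothetical names, and the user agent string and timeout are placeholder values, not recommendations.

```csharp
using System.IO;
using System.Net;

static class PageDownloader
{
    // Builds the request; pass a null proxyHost to use the default proxy settings.
    public static HttpWebRequest CreateRequest(string url, string proxyHost = null, int proxyPort = 0)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = "Mozilla/5.0 (compatible; RankCheckSketch)"; // placeholder value
        request.Timeout = 15000; // milliseconds, placeholder value
        if (proxyHost != null)
        {
            request.Proxy = new WebProxy(proxyHost, proxyPort);
        }
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string Download(string url, string proxyHost = null, int proxyPort = 0)
    {
        var request = CreateRequest(url, proxyHost, proxyPort);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Nothing here touches a control or COM, so Download can be called from BackgroundWorker or thread pool threads, one request per thread, with no STA or message loop needed.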