Foreach loop takes a very long time to break out of
I am scraping a webpage containing about 250 td tags, using WatiN and WatinCSSSelectors.
First I select all td tags with attribute 'width=90%':
var allMainTDs = browser.CssSelectAll("td[width=\"90%\"]");
Then I make a foreach loop and stick the contents of the var into a List. The int is there to check which td tag the loop is currently at.
List<Element> eletd = new List<Element>();
int i = 0;
foreach (Element td in allMainTDs)
{
    eletd.Add(td);
    i++;
    Console.WriteLine(i);   // log which td the loop is currently at
}
It reaches the 250th tag fairly quickly, but it then takes about 6 minutes (timed with a Stopwatch object) to move on to the next statement. What is happening here?
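A minimal sketch of how that gap can be measured, using the same allMainTDs context as above (the exact Stopwatch placement is an assumption, not taken from the original code):
// requires System.Diagnostics for Stopwatch
var sw = Stopwatch.StartNew();
TimeSpan lastIteration = TimeSpan.Zero;

var eletd = new List<Element>();
foreach (Element td in allMainTDs)
{
    eletd.Add(td);
    lastIteration = sw.Elapsed;   // time at which the loop body last ran
}

// Everything after the closing brace runs only once the foreach has fully
// exited, which includes the final MoveNext() and the enumerator's Dispose().
Console.WriteLine("Last iteration at: " + lastIteration);
Console.WriteLine("Loop exited at:    " + sw.Elapsed);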
You could try this:
var eletd = new List<Element>(allMainTDs);
A foreach loop is roughly equivalent to the following code (not exactly, but close enough):
IEnumerator<T> enumerator = enumerable.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        T element = enumerator.Current;
        // here goes the body of the loop
    }
}
finally
{
    IDisposable disposable = enumerator as System.IDisposable;
    if (disposable != null) disposable.Dispose();
}
The behavior you describe points to the cleanup portion of this code. It's possible that the enumerator for the result of the CssSelectAll call has a heavy Dispose method. You could confirm this by replacing your loop with something like the code above and omitting the finally block, or by setting breakpoints to confirm that Dispose takes forever to run.
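A minimal sketch of that diagnostic, assuming CssSelectAll returns an IEnumerable<Element> so the enumerator's Current is an Element (the explicit timing around Dispose is added here for illustration):
// requires System.Diagnostics for Stopwatch
var enumerator = allMainTDs.GetEnumerator();
var eletd = new List<Element>();

// Enumerate without the implicit try/finally that foreach generates.
while (enumerator.MoveNext())
{
    eletd.Add(enumerator.Current);
}

// Now dispose explicitly and time just that call.
var disposable = enumerator as IDisposable;
if (disposable != null)
{
    var sw = Stopwatch.StartNew();
    disposable.Dispose();
    Console.WriteLine("Dispose took: " + sw.Elapsed);
}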
If you are on .NET 4.0 and your execution environment allows for parallelism, you could try:
Parallel.ForEach(...);
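For illustration, a minimal sketch of that approach; whether WatiN elements can safely be touched from multiple threads is not established here, so treat this purely as a sketch of the Parallel.ForEach call:
// requires System.Threading.Tasks and System.Collections.Concurrent
var bag = new ConcurrentBag<Element>();

// Process the elements in parallel, collecting them into a thread-safe bag.
Parallel.ForEach(allMainTDs, td =>
{
    bag.Add(td);
});

var eletd = new List<Element>(bag);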