Foreach loop takes a very long time to break out of

I'm scraping a webpage containing about 250 table cells (td elements), using WatiN and WatinCSSSelectors.

First I select all td tags with attribute 'width=90%':

var allMainTDs = browser.CssSelectAll("td[width=\"90%\"]");

Then I use a foreach loop to put the contents of that variable into a List. The int is there to track which td tag the loop is currently at.

List<Element> eletd = new List<Element>();
int i = 0;
foreach (Element td in allMainTDs)
{
    eletd.Add(td);
    i++;
    Console.WriteLine(i);                    
}

It reaches the 250th tag fairly quickly, but then it takes about 6 minutes (timed with a Stopwatch object) to move on to the next statement. What is happening here?


You could try this:

var eletd = new List<Element>(allMainTDs);


A foreach loop is roughly equivalent to the following code (not exactly, but close enough):

IEnumerator<T> enumerator = enumerable.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        T element = enumerator.Current;
        // here goes the body of the loop
    }
}
finally
{
    IDisposable disposable = enumerator as System.IDisposable;
    if (disposable != null) disposable.Dispose();
}

The behavior you describe points to the cleanup portion of this code. It's possible that the enumerator returned by the CssSelectAll call has an expensive Dispose method. You could confirm this by replacing your loop with something like the code above and omitting the finally block, or by setting breakpoints to confirm that Dispose is what takes so long to run.

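For example, a minimal diagnostic sketch along these lines (assuming CssSelectAll returns something enumerable as IEnumerable<Element>, and reusing the browser object from your existing WatiN setup) could time the enumeration and the Dispose call separately:

using System;
using System.Collections.Generic;
using System.Diagnostics;

// "browser", "Element" and CssSelectAll come from the question's
// WatiN + WatinCSSSelectors setup; the return type of CssSelectAll
// is assumed to be IEnumerable<Element> here.
var allMainTDs = browser.CssSelectAll("td[width=\"90%\"]");

var eletd = new List<Element>();
var sw = Stopwatch.StartNew();

IEnumerator<Element> enumerator = allMainTDs.GetEnumerator();
while (enumerator.MoveNext())
{
    eletd.Add(enumerator.Current);
}
Console.WriteLine("MoveNext/Add loop took {0}", sw.Elapsed);

sw.Restart();
enumerator.Dispose();   // if cleanup is the culprit, the delay shows up here
Console.WriteLine("Dispose took {0}", sw.Elapsed);

If the second timing accounts for the missing minutes, you know the cost is in the enumerator's cleanup rather than in your loop body.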

If you're on .NET 4.0 and your execution environment allows for parallelism, you could also try

  Parallel.ForEach(...);
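
A minimal sketch of what that could look like, assuming allMainTDs can be enumerated as an IEnumerable<Element> and that you only need the elements collected (in no particular order). Note that WatiN drives the browser through COM, which is generally not thread-safe, so treat this as a starting point rather than a drop-in fix:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// ConcurrentBag is used instead of List<Element> because List<T>.Add
// is not safe to call from multiple threads at once.
var eletd = new ConcurrentBag<Element>();

Parallel.ForEach(allMainTDs, td =>
{
    eletd.Add(td);
});

Console.WriteLine(eletd.Count);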