
Is this multithreading loop efficient?

I have the following multithreaded function, which starts threads that fetch from a list of URLs and parse the content. The code was suggested by a user, and I just want to know whether this is an efficient way of implementing what I need. I am running the code now and getting errors in all the functions that worked fine with a single thread. For example, for the list that I use to check visited URLs, I am getting 'ArgumentOutOfRangeException - capacity was less than the current size'. Does everything now need to be synchronized?

        Dim startwatch As New Stopwatch
        Dim elapsedTime As Long = 0
        Dim urlCompleteList As String = String.Empty
        Dim numThread As Integer = 0
        Dim ThreadList As New List(Of Thread)

        startwatch.Start()
        For Each link In completeList
            Dim thread = New Thread(AddressOf processUrl)
            thread.Start(link)
            ThreadList.Add(thread)
        Next

        For Each Thread In ThreadList
            Thread.Join()
        Next

        startwatch.Stop()
        elapsedTime = startwatch.ElapsedMilliseconds


    End Sub
    Public Sub processUrl(ByVal url As String)

        'make sure we never visited this before
        If Not VisitedPages.Contains(url) Then
            VisitedPages.Add(url)
            Dim startwatch As New Stopwatch
            Dim elapsedTime As Long = 0


If the VisitedPages collection used in processUrl is shared among the threads, then yes, you need to ensure that only one thread can access that collection at a time, unless the collection itself is thread-safe and takes care of that for you.

The same goes for any other data that is shared among the threads you create.
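As a rough sketch of the "thread-safe collection" option, assuming .NET 4 or later is available: the question does not show how VisitedPages is declared, so the declaration below is hypothetical. A ConcurrentDictionary is used here as a set, and its TryAdd call replaces the separate Contains and Add calls with a single atomic step.

```vb
Imports System.Collections.Concurrent

Module VisitedPagesSketch
    ' Hypothetical declaration: the original post does not show VisitedPages.
    ' The dictionary is used as a set; the Boolean value is ignored.
    Private ReadOnly VisitedPages As New ConcurrentDictionary(Of String, Boolean)

    ' Takes Object so it matches ParameterizedThreadStart under Option Strict On.
    Public Sub processUrl(ByVal state As Object)
        Dim url As String = CStr(state)

        ' TryAdd checks and inserts atomically, so it returns True
        ' for only one thread per distinct url.
        If Not VisitedPages.TryAdd(url, True) Then Return

        ' ... fetch and parse the page here ...
    End Sub
End Module
```

Combining the check and the insert matters: with a separate Contains followed by Add, two threads could both see the URL as unvisited and both go on to process it, even if each individual call were thread-safe.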


I don't see where VisitedPages is declared, but it is not local to the processUrl method, which means it is shared between all of the threads. Multiple threads modifying the same list/collection at the same time can corrupt its internal state and produce errors like the one you describe. You will need to protect the VisitedPages collection with a lock, a mutex, or something similar to guard against this.
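One way to guard the collection, sketched here under the assumption that VisitedPages can be a HashSet(Of String), is VB's SyncLock statement. The VisitedLock object and the MarkVisited helper are names invented for this illustration; they are not part of the original code.

```vb
Imports System.Collections.Generic

Module LockedVisitedPagesSketch
    ' Hypothetical declarations: neither appears in the original post.
    Private ReadOnly VisitedPages As New HashSet(Of String)
    Private ReadOnly VisitedLock As New Object

    ' Returns True only for the first caller that registers this url.
    Private Function MarkVisited(ByVal url As String) As Boolean
        SyncLock VisitedLock
            ' HashSet.Add returns False if the item was already present.
            Return VisitedPages.Add(url)
        End SyncLock
    End Function

    Public Sub processUrl(ByVal state As Object)
        Dim url As String = CStr(state)
        If Not MarkVisited(url) Then Return

        ' ... fetch and parse outside the lock, so threads only
        ' wait on each other for the brief set update ...
    End Sub
End Module
```

Every other read or write of VisitedPages would need to take the same lock; guarding only the Add call leaves the collection exposed elsewhere.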
