开发者

deallocating memory for objects I haven't set to null

EDIT: Problem wasn't related to the question. It was indeed something wrong with my code, and actually, it was so simple that I don't want to put it on the internet. Thanks anyway.

I read in roughly 550k Active directory records and store them in a List, the class being a simple wrapper for an AD user. I then split the list of ADRecords into four lists, each containing a quarter of the total. After I do this, I read in about 400k records from a database, known as EDR records, into a DataTable. I take the four quarters of my list and spawn four threads, passing each one of the four quarters. I have to match the AD records to the EDR records using email right now, but we plan to add more things to match on later.

I have a foreach on the list of AD records, and inside of that, I have to run a for loop on the EDR records to check each one, because if an AD record matches more than one EDR record, then that isn't a direct match, and should not be treated as a direct match.

My problem, by the time I get to this foreach on the list, my ADRecords list only has about 130 records in it, but right after I pull them all in, I Console.WriteLine the count, and it's 544k.

I am starting to think that even though I haven't set the list to null to be collected later, C# or Windows or something is actually taking my list away to make room for the EDR records because I haven't used the list in a while. The database that I have to use to read EDR records is a linked server, so it takes about 10 minutes to read them all in, so my list is actually idle for 10 minutes, but it's never set to null.

Any ideas?

//splitting list and passing in values to threads.
List<ADRecord> adRecords = GetAllADRecords();
        for (int i = 0; i < adRecords.Count/4; i++)
        {
            firstQuarter.Add(adRecords[i]);
        }
        for (int i = adRecords.Count/4; i < adRecords.Count/2; i++)
        {
            secondQuarter.Add(adRecords[i]);
        }
        for (int i = adRecords.Count/2; i < (adRecords.Count/4)*3; i++)
        {
            thirdQuarter.Add(adRecords[i]);
        }
        for (int i = (adRecords.Count/4)*3; i < adRecords.Count; i++)
        {
            fourthQuarter.Add(adRecords[i]);
        }
        DataTable edrRecordsTable = GetAllEDRRecords();

        DataRow[] edrRecords = edrRecordsTable.Select("Email_Address is not null and Email_Address <> ''", "Email_Address");
        Dictionary<string, int> letterPlaces = FindLetterPlaces(edrRecords);
        Thread one = new Thread(delegate() { ProcessMatches(firstQuarter, edrRecords, letterPlaces); });
        Thread two = new Thread(delegate() { ProcessMatches(secondQuarter, edrRecords,  letterPlaces); });
        Thread three = new Thread(delegate() { ProcessMatches(thirdQuarter, edrRecords,  letterPlaces); });
        Thread four = new Thread(delegate() { ProcessMatches(fourthQuarter, edrRecords, letterPlaces); });
        one.Start();
        two.Start();
        three.Start();
        four.Start();

In Proce开发者_C百科ssMatches, there is a foreach on the List of ADRecords passed in. The first line in the foreach is AdRecordsProcessed++; which is a global static int, and the program finishes with it at 130 instead of the 544k.


The variable is never set to null and is still in scope? If so, it shouldn't be collected and idle time isn't your problem.

First issue I see is:

AdRecordsProcessed++; 

Are you locking that global variable before updating it? If not, and depending on how fast the records are processed, it's going to be lower than you expect.

Try running it from a single thread (i.e. pass in adRecords instead of firstQuarter and don't start the other threads.) Does it work as expected with 1 thread?


Firstly, you don't set a list to null. What you might do is set every reference to a list to null (or to another list), or all such references might simply fall out of scope. This may seem like a nitpick point, but if you are having to examine what is happening to your data it's time to be nitpicky on such things.

Secondly, getting the GC to deallocate something that has a live reference is pretty hard to do. You can fake it with a WeakReference<> or think you've found it when you hit a bug in a finaliser (because the reference isn't actually live, and even then its a matter of the finaliser trying to deal with a finalised rather than deallocated object). Bugs can happen everywhere, but that you've found a way to make the GC deallocate something that is live is highly unlikely.

The GC will be likely do two things with your list:

  1. It is quite likely to compact the memory used by it, which will move its component items around.
  2. It is quite likely to promote it to a higher generation.

Neither of these are going to have any changes you will detect unless you actually look for them (obviously you'll notice a change in generation if you keep calling GetGeneration(), but aside from that you aren't really going to).

The memory used could also be paged out, but it will be paged back in when you go to use the objects. Again, no effect you will notice.

Finally, if the GC did deallocate something, you wouldn't have a reduced number of items, you'd have a crash, because if objects just got deallocated the system will still try to use the supposedly live references to them.

So, while the GC or the OS may do something to make room for your other object, it isn't something observable in code, and it does not stop the object from being available and in the same programmatic state.

Something else is the problem.


Is there a reason you have to get all the data all at once? If you break the data up into chunks it should be more manageable. All I know is having to get into GC stuff is a little smelly. Best to look at refactoring your code.


The garbage collector will not collect:

  • A global variable
  • Objects managed by static objects
  • A local variable
  • A variable referencable by any method on the call stack

So if you can reference it from your code, there is no possibility that the garbage collector collected it. No way, no how.

In order for the collector to collect it, all references to it must have gone away. And if you can see it, that's most definitely not the case.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜