Is it bad practice to depend on the .NET automated garbage collector?
It's possible to create lots of memory-intensive objects and then abandon references to them. For example, I might want to download and operate on some data from a database, and I will do 100 separate download and processing iterations. I could declare a DataTable variable once, and for each query reset it to a new DataTable object using a constructor, abandoning the old DataTable object in memory.
The DataTable class has easy built-in ways to release the memory it uses, including Rows.Clear() and .Dispose(). So I could do this at the end of every iteration before setting the variable to a new DataTable object. OR I could forget about it and just let the CLR garbage collector do this for me. The garbage collector seems to be pretty effective, so the end result should be the same either way. Is it "better" to explicitly dispose of memory-heavy objects when you no longer need them (at the cost of adding code to do this), or to just depend on the garbage collector to do all the work for you (you are at the mercy of the GC algorithm, but your code is smaller)?
Upon request, here is code illustrating the recycled DataTable variable example:
// queryList is a list of 100 SELECT queries generated somewhere else.
// Each of them returns a million rows with 10 columns.
List<string> queryList = GetQueries(@"\\someserver\bunch-o-queries.txt");
DataTable workingTable;
using (OdbcConnection con = new OdbcConnection("a connection string")) {
    using (OdbcDataAdapter adpt = new OdbcDataAdapter("", con)) {
        foreach (string sql in queryList) {
            workingTable = new DataTable(); // A new table is created. Previous one is abandoned
            adpt.SelectCommand.CommandText = sql;
            adpt.Fill(workingTable);
            CalcRankingInfo(workingTable);
            PushResultsToAnotherDatabase(workingTable);
            // Here I could call workingTable.Dispose() or workingTable.Rows.Clear()
            // or I could do nothing and hope the garbage collector cleans up my
            // enormous DataTable automatically.
        }
    }
}
@Justin
...so, to answer your question, No. It is not a bad practice to depend on the .NET Garbage Collector. Quite the opposite in fact.
It is a horrible practice to rely on the GC to clean up for you. It's unfortunate that you are recommending that. Doing so can very likely lead you down the path of having a memory leak, and yes, there are at least 22 ways you can "leak memory" in .NET. I've worked with a huge number of clients diagnosing both managed and unmanaged memory leaks, providing solutions to them, and have presented at multiple .NET user groups on Advanced GC Internals and how memory management works from the inside of the GC and CLR.
@OP: You should call Dispose() on the DataTable and explicitly set it to null at the end of the loop. This explicitly tells the GC that you are done with it and that there are no more rooted references to it. The DataTable is being placed on the LOH because of its large size. Not doing this can easily fragment your LOH, resulting in an OutOfMemoryException. Remember that the LOH is not compacted by default!
For additional details, please refer to my answer at
What happens if I don't call Dispose on the pen object?
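As a rough sketch of what disposing the table on every iteration could look like: the example below swaps the ODBC adapter for an in-memory stand-in so that it runs without a database. FillWithSampleRows and the row counts are hypothetical and not part of the original question.

```csharp
using System;
using System.Data;

static class Program
{
    // Hypothetical stand-in for adpt.Fill(workingTable): fills the table in memory.
    public static void FillWithSampleRows(DataTable table, int rowCount)
    {
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Score", typeof(double));
        for (int i = 0; i < rowCount; i++)
        {
            table.Rows.Add(i, i * 0.5);
        }
    }

    static void Main()
    {
        for (int iteration = 0; iteration < 3; iteration++)
        {
            // A fresh table per iteration; the using block calls Dispose()
            // when the block exits, even if an exception is thrown.
            using (DataTable workingTable = new DataTable())
            {
                FillWithSampleRows(workingTable, 1000);
                Console.WriteLine($"Iteration {iteration}: {workingTable.Rows.Count} rows");
            }
            // workingTable is out of scope here, so this code holds no reference to it.
        }
    }
}
```

Because the variable is declared inside the using statement, it goes out of scope at the end of each iteration, which has much the same effect as assigning null to a variable declared outside the loop.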
@Henk - There is a relationship between IDisposable and memory management; IDisposable allows for a semi-explicit release of resources (if implemented correctly). And resources always have some sort of managed and typically unmanaged memory associated with them.
A couple of things to note about Dispose() and IDisposable here:
IDisposable provides for disposal of both Managed and Unmanaged memory. Disposal of Unmanaged memory should be done in the Dispose Method and you should provide a Finalizer for your IDisposable implementation.
The GC does not call Dispose for you.
If you don't call Dispose(), an object with a finalizer stays registered on the Finalization queue and, once it is unreachable, is moved to the f-reachable queue. Finalization means it takes at least two collections to reclaim the object, so it will be promoted to Gen1 if it was in Gen0, and to Gen2 if it was in Gen1. In your case, the object is on the LOH, so it survives until a full GC (all generations plus the LOH) has been performed twice. In a "healthy" .NET app, roughly 1 in every 100 collections is a full collection. Because your implementation puts a lot of pressure on the LOH and the GC, full GCs will fire more often. This is undesirable for performance reasons since full GCs take much more time to complete. It also depends on which kind of GC you're running under and whether you are using LatencyModes (be very careful with this). Even if you're running Background GC (this has replaced Concurrent GC in CLR 4.0), an ephemeral collection (Gen0 and Gen1) still blocks/suspends threads, which means no allocations can be performed during that time. You can use PerfMon to monitor memory utilization and GC activity in your app. Please note that the GC counters are updated only after a GC has taken place. For additional info on versions of GC, see my response to
Determining which garbage collector is running.
Dispose() immediately releases the resources associated with your object. Yes, GC is non-deterministic, but calling Dispose() does not trigger a GC!
Dispose() lets the GC know that you are done with this object and its memory can be reclaimed at the next collection for the generation where that object lives. If the object lives in Gen2 or on the LOH, that memory will not be reclaimed if either a Gen0 or Gen1 collection takes place!
The Finalizer runs on 1 thread (regardless of the version of GC being used and the number of logical processors on the machine). If you stick a lot in the Finalization and f-reachable queues, you only have 1 thread processing everything ready for Finalization; your performance goes you know where...
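Besides PerfMon, some of the GC behavior mentioned in the notes above can be inspected from inside the process. The following is only a sketch: GcReport and the allocation loop are made up for illustration, while GC.CollectionCount, GCSettings.IsServerGC and GCSettings.LatencyMode are standard framework members.

```csharp
using System;
using System.Runtime;

static class GcReport
{
    // Summarizes the GC flavor in use and how many collections each generation has seen.
    public static string Describe()
    {
        return $"ServerGC={GCSettings.IsServerGC}, LatencyMode={GCSettings.LatencyMode}, " +
               $"Gen0={GC.CollectionCount(0)}, Gen1={GC.CollectionCount(1)}, " +
               $"Gen2={GC.CollectionCount(2)}";
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine("Before: " + GcReport.Describe());

        // Hypothetical workload: many short-lived allocations to provoke ephemeral collections.
        for (int i = 0; i < 100000; i++)
        {
            byte[] junk = new byte[1024];
            junk[0] = 1;
        }

        Console.WriteLine("After:  " + GcReport.Describe());
    }
}
```

A Gen2 count that climbs almost as fast as the Gen0 count would suggest that full collections are firing often, which is the situation described above.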
For info on how to properly implement IDisposable, please refer to my blog post:
How do you properly implement the IDisposable pattern?
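For readers without access to that post, here is a minimal sketch of the commonly used dispose pattern. It is not the code from the blog post; NativeBuffer and IsDisposed are hypothetical names chosen for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type that owns a block of unmanaged memory.
public class NativeBuffer : IDisposable
{
    private IntPtr _buffer;
    private bool _disposed;

    public NativeBuffer(int sizeInBytes)
    {
        _buffer = Marshal.AllocHGlobal(sizeInBytes);
    }

    public bool IsDisposed
    {
        get { return _disposed; }
    }

    public void Dispose()
    {
        Dispose(true);
        // The cleanup has already run, so the finalizer is no longer needed.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return; // Safe to call Dispose() more than once.
        }

        if (disposing)
        {
            // Dispose other managed IDisposable fields here (none in this sketch).
        }

        // Unmanaged memory is released on both the Dispose() and the finalizer path.
        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;
        }

        _disposed = true;
    }

    // Safety net in case Dispose() is never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }
}

static class Program
{
    static void Main()
    {
        using (NativeBuffer buffer = new NativeBuffer(1024))
        {
            Console.WriteLine($"Disposed inside using block: {buffer.IsDisposed}");
        }
    }
}
```

Calling GC.SuppressFinalize in Dispose() keeps a properly disposed object out of the finalization path discussed above.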
Ok, time to clear things up a bit (since my original post was a little muddy).
IDisposable has nothing to do with Memory Management. IDisposable allows an object to clean up any native resources it might be holding on to. If an object implements IDisposable, you should be sure to either use a using block or call Dispose() when you're finished with it.
As for defining memory-intensive objects and then losing the references to them, that's how the Garbage Collector works. It's a good thing. Let it happen and let the Garbage Collector do its job.
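To see an abandoned object being reclaimed, a sketch like the following can help. The forced GC.Collect() calls are there only to make the effect observable in a demo and are not a recommendation for production code; AllocateAndAbandon is a hypothetical helper.

```csharp
using System;
using System.Runtime.CompilerServices;

static class Program
{
    // Allocates a large array and returns only a weak reference to it,
    // so no strong reference survives once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateAndAbandon()
    {
        byte[] big = new byte[10 * 1000 * 1000];
        big[0] = 1;
        return new WeakReference(big);
    }

    static void Main()
    {
        WeakReference weak = AllocateAndAbandon();

        // Forced here purely so the result can be inspected right away.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine($"Still alive after collection: {weak.IsAlive}");
    }
}
```

In a normal application the same reclamation happens whenever the runtime decides a collection is due, without any explicit call.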
...so, to answer your question, No. It is not a bad practice to depend on the .NET Garbage Collector. Quite the opposite in fact.
I also agree with Dave's post. You should always dispose and release your database connections, even if the documentation for the framework you are working with says it is not needed.
As a DBA who has worked with MS SQL, Oracle, Sybase/SAP, and MySQL, I have been brought in to work on mysterious locking and memory leaks that were blamed on the database when, in fact, the issue was that the developer did not close and destroy their connection objects when they were done with them. I've even seen apps that leave idle connections open for days, and it can really make things bad when your database is clustered, mirrored, or using AlwaysOn Availability Groups in SQL Server 2012.
When I took my first .NET class, the instructor taught us to keep database connections open only while you are using them. Get in, get your work done, and get out. This change has made several systems I have helped optimize a lot more reliable. It also frees up connection memory in the RDBMS, giving more RAM to buffer I/O.