Memory implications of returning a query from a CFC
I've written a database load script in ColdFusion and I'm having a problem that the script slowly runs out of memory. I've split each table load into its own thread with <cfthread> and I'm calling the garbage collector when memory dips below 50% (making sure to have 30 seconds between gc() calls to prevent the garbage collector from hogging memory).
I created a CFC to hold all the queries needed by the script. The script calls the appropriate CFC function which then returns the query, some of which are over 2 MB in size. When I look in the Server Monitor in the details view of the Memory page for Active Threads, it looks like my CFC is keeping a copy of the query in memory even though I varscoped the query variable and the variable went out of scope at the end of the function. In addition, I have a copy of the query in memory in my thread. So I'm left with what looks like two copies of the query in memory. Is this really what's happening? If it is, how can I eliminate one copy of the query from memory?
There are a lot of potential issues here, but I'll try to outline some of the most important things for you to consider:
- Why the threads? Do you need the threads? There's a certain point at which you're probably tinkering too much for your own good.
- Manually forcing garbage collection isn't necessarily a good idea. Tune the JVM to perform its garbage collection automatically, but don't overdo it, either. Garbage Collection tends to be expensive, and can impact the performance of your app if it is running too frequently.
- How are you instantiating your CFC? If you are instantiating the CFC on every request for the query, you're going to experience RAM issues over time: a slow memory leak as CFCs are loaded into RAM faster than garbage collection can keep up. Your best bet is to make this a singleton (i.e., store it in the application scope).
- Be aware that var-scoping a variable doesn't (as far as I understand it) automatically free up the memory as soon as the variable stops being used. The memory is still reserved, though it's likely flagged somehow as being part of a short-lived generation so that it will (probably?) be cleaned up faster. But this doesn't guarantee anything.
- If you're looking at active threads, it's also possible that the query isn't going to be cleared until the end of the request, not necessarily the end of the function call. Expecting a query to die the moment the function call completes is probably impatience more than anything.
- ColdFusion queries are passed by reference, not by value. It should be impossible to get 2 copies of the query in memory, unless you're somehow using duplicate() or a similar function to explicitly copy the query.
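The singleton suggestion in the list above could be sketched roughly like this. It is only a minimal example, assuming the query CFC is called LoaderQueries (a hypothetical name, as is the application name) and sits next to Application.cfc:

```cfml
<!--- Application.cfc: creates the query CFC once and keeps it in the application scope --->
<cfcomponent output="false">
    <!--- "dbLoadScript" is a made-up application name for this sketch --->
    <cfset this.name = "dbLoadScript">

    <cffunction name="onApplicationStart" returntype="boolean" output="false">
        <!--- LoaderQueries is a hypothetical name for the CFC that holds the queries --->
        <cfset application.loaderQueries = createObject("component", "LoaderQueries")>
        <cfreturn true>
    </cffunction>
</cfcomponent>
```

The load script would then call functions on application.loaderQueries instead of creating a new instance of the CFC each time it needs a query.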
Your cfreturn statement is most likely returning a reference (a pointer) to the query, not a copy. That query will not be cleaned up until nothing references it any longer. So if the function hands the query to some other process, the query is not going to be cleaned out of memory while that process holds on to it. If you set that query to a session variable, for instance, that pointer isn't going anywhere until that session variable is gone, no matter how frequently you try to force garbage collection.
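To see the difference between sharing a reference and making a real copy, a small test page along these lines may help. It is just an illustration with a throwaway in-memory query built with queryNew(); all variable names are made up:

```cfml
<!--- Build a one-row query in memory --->
<cfset original = queryNew("id", "integer")>
<cfset queryAddRow(original)>
<cfset querySetCell(original, "id", 1)>

<!--- Plain assignment: both names refer to the same query object --->
<cfset sameQuery = original>

<!--- duplicate() creates a second, independent query in memory --->
<cfset copiedQuery = duplicate(original)>

<!--- Add a second row through the original variable --->
<cfset queryAddRow(original)>
<cfset querySetCell(original, "id", 2)>

<cfoutput>
    original: #original.recordCount#,
    sameQuery: #sameQuery.recordCount#,
    copiedQuery: #copiedQuery.recordCount#
</cfoutput>
```

The added row shows up through sameQuery as well, because it is the same object, while copiedQuery still holds only the first row. Only the duplicate() call produces a second query that takes up its own memory.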
Just a few things to consider.
I had a similar problem with processing a large data insert, where each row requires extensive processing involving multiple CFCs. It appears that the JDBC ResultSet, Statement and Connection references created by <cfquery> are held until the end of the request. This means that nulling your query variable has no effect on memory usage. The way I got around this was to make a gateway call to a CFC function to process 100 rows, then that function makes another gateway call for the next 100 rows, and so on until all rows are processed. Because each individual gateway call actually exits, it releases all its handles and that memory gets recovered.
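A rough sketch of the batching idea, with several assumptions: it treats the "gateway call" as a plain HTTP request made with <cfhttp>, it drives the batches from a loop rather than chaining one call to the next, and the URL, the BatchWorker.cfc component and its remote processBatch function (assumed to return the number of rows it handled) are all hypothetical:

```cfml
<!--- loader.cfm: hypothetical driver page. Each cfhttp call runs as its own request,
      so whatever that request holds can be released when it ends. --->
<cfset batchSize = 100>
<cfset offset = 0>
<cfset done = false>

<cfloop condition="NOT done">
    <!--- BatchWorker.cfc and processBatch are made-up names for this sketch --->
    <cfhttp url="http://localhost/loader/BatchWorker.cfc?method=processBatch&returnFormat=plain"
            method="post" result="batchResult" timeout="300">
        <cfhttpparam type="formfield" name="offset" value="#offset#">
        <cfhttpparam type="formfield" name="batchSize" value="#batchSize#">
    </cfhttp>

    <!--- val() yields 0 for a non-numeric response, which also ends the loop --->
    <cfset rowsHandled = val(trim(batchResult.fileContent))>
    <cfset done = rowsHandled LT batchSize>
    <cfset offset = offset + rowsHandled>
</cfloop>
```

The loop stops as soon as a batch comes back with fewer rows than requested. The point is only that no single request ever holds more than one batch worth of query handles.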