SQL Concatenation filling up tempDB
We are attempting to concatenate possibly thousands of rows of text in SQL with a single query. The query that we currently have looks like this:
DECLARE @concatText NVARCHAR(MAX)
SET @concatText = ''
UPDATE TOP (SELECT MAX(PageNumber) + 1 FROM #OrderedPages) [#OrderedPages]
SET @concatText = @concatText + [ColumnText] + '
'
WHERE (RTRIM(LTRIM([ColumnText])) != '')
This is working perfectly fine from a functional standpoint. The only issue we're having is that sometimes the ColumnText can be a few kilobytes in length. As a result, we're filling up tempDB when we have thousands of these rows.
The best reason that we have come up with is that as we're doing these updates to @concatText, SQL is using implicit transactions so the strings are effectively immutable.
We are trying to figure out a good way of solving this problem and so far we have two possible solutions: 1) Do the concatenation in .NET. This is an OK option, but that's a lot of data that may go back across the 开发者_如何转开发wire.
2) Use .WRITE which operates in a similar fashion to .NET's String.Join method. I can't figure out the syntax for this as BoL doesn't cover this level of SQL shenanigans.
This leads me to the question: Will .WRITE work? If so, what's the syntax? If not, are there any other ways to do this without sending data to .NET? We can't use FOR XML
because our text may contain illegal XML characters.
Thanks in advance.
I'd look at using CLR integration, as suggested in @Martin's comment. A CLR aggregate function might be just the ticket.
What exactly is filling up tempdb? It cannot be @concatText = @concatText + [ColumnText]
, there is no immutability involved and the @concatText variable will be at worst case 2GB size (I expect your tempdb is much larger than that, if not increase it). It seems more like your query plan creates a spool for haloween protection and that spool is the culprit.
As a generic answer, using the UPDATE ... SET @var = @var + ...
for concatenation is known to have correctness issues and is not supported. Alternative approaches that work more reliably are discussed in Concatenating Row Values in Transact-SQL.
First, from your post, it isn't clear whether or why you need temp tables. Concatenation can be done inline in a query. If you show us more about the query that is filling up tempdb, we might be able to help you rewrite it. Second, an option that hasn't been mentioned is to do the string manipulation outside of T-SQL entirely. I.e., in your middle-tier query for the raw data, do the manipulation and push it back to the database. Lastly, you can use Xml such that the results handle escapes and entities properly. Again, we'd need to know more about what and how you are trying to accomplish.
Agreed..A CLR User Defined Function would be the best approach for what you guys are doing. You could actually read the text values into an object and then join them all together (inside the CLR) and have the function spit out a NVARCHAR(MAX) result. If you need details on how to do this let me know.
精彩评论