Building a table that summarizes data about another table - Optimization suggestions?
I am trying to build a table that summarizes how many times pages in our Wiki references certain macros.
I have built a temporary table called @currentpages that with 55k rows that looks like this:
DECLARE @currentpages table(
ContentID NUMERIC(19,0) NOT NULL PRIMARY KEY,
PageTitle VARCHAR(255) NULL,
SpaceKey VARCHAR(255) NULL,
OriginalAuthor VARCHAR(255) NULL,
LastChangedBy VARCHAR(255) NULL,
LastChangedDt VARCHAR(10) NULL,
ContentBody TEXT NULL);
and another table that looks like this:
DECLARE @usage table(
SpaceKey VARCHAR(255) NOT NULL PRIMARY KEY,
Macro1 NUMERIC(19,0) NULL,
Macro2 NUMERIC(19,0) NULL,
Macro3 NUMERIC(19,0) NULL);
(I've simplified the above; it actually has about 40 columns like Macro1, Macro2, etc.) I'm trying to get counts of how many references there are (in @currentpages) to these various macros.
I initialize the @usage table by creating a row for each possible SpaceKey value, zeroing out all of the Macro"x" counters:
INSERT INTO @usage (SpaceKey, Macro1, Macro2, Macro3)
SELECT S.spacekey, 0, 0, 0
FROM spaces S
ORDER BY S.spacekey
Then I run the first of several UPDATE statements to identify each macro reference:
UPDATE @usage
SET U.AdvancedSearch = C.Counter
FROM @usage U
INNER JOIN (SELECT SpaceKey, COUNT(*) AS Counter
FROM @currentpages
WHERE Conte开发者_运维技巧ntBody LIKE '%{search-%' GROUP BY SpaceKey) C
ON U.SpaceKey = C.SpaceKey
This appears to work fine, but it runs a very long time. Is there a more efficient way to do what I'm trying to do?
This database is on SQL Server 2005.
Many thanks, Betsy
Investigate full text search. (Full Text Search @ MSDN) I don't have any experience with it, so can not offer advice.
Instead of querying @CurrentPages once per macro, get all the macro counts at once. Something like:
INSERT INTO @usage (SpaceKey, Macro1, Macro2, Macro3) SELECT S.spacekey , count(case when ContentBody LIKE '%Search Macro 1%' then 1 else null end) as Macro1_Count , count(case when ContentBody LIKE '%Search Macro 2%' then 1 else null end) as Macro2_Count , count(case when ContentBody LIKE '%Search Macro 3%' then 1 else null end) as Macro=3Count FROM spaces S LEFT OUTER JOIN @CurrentPages C ON C.SpaceKey = S.SpaceKey GROUP BY S.spacekey
Don't make a column per macro, but make a table like this.
DECLARE @macrotype table(
type int NOT NULL PRIMARY KEY
, MacroName varchar(100) NOT NULL
, mask varchar(100) NOT NULL
)
DECLARE @usage table(
SpaceKey VARCHAR(255) NOT NULL PRIMARY KEY,
, MacroType int NOT NULL
, MacroCount int NOT NULL
)
Once you have that you can populate it by running
INSERT INTO @macrotype VALUES (1, 'search', '%{search-%')
, (2, 'yadayada', '%{yadayada-%')
, ...
DECLARE @c int, @m varchar(100)
DECLARE c CURSOR READ_ONLY FOR SELECT type, mask FROM @macrotype
OPEN c
FETCH NEXT FROM c INTO @t, @m
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT INTO @usage (SpaceKey, MacroType, MacroCount)
SELECT SpaceKey
, @t
, COUNT(*) AS Counter
FROM @currentpages
WHERE ContentBody LIKE @m
GROUP BY SpaceKey
FETCH NEXT FROM c INTO @t, @m
END
CLOSE c
DEALLOCATE c
At the end of your story you can query the @usage table to make counts in whatever form you wish.
SELECT MacroName, count = count(*)
FROM @usage u
JOIN @macrotype m
ON u.MacroType - m.type
GROUP BY MacroName
On 55k records this will run for a few minutes.
I would imagine any performance hits are going to come from the subquery containing the LIKE clause since it's going to be doing a full table scan. I don't really see what else would slow this down too much.
You could test that by running just the sub-query and comparing the time it takes to return to the time the entire update takes for one of your columns.
精彩评论