Sql Server Subquery Looping
I have a subquery that sums the days a Listing has been active. There are 3 different priorities for these listings that we want to analyze seperately. The subqueries works fine but I have to repeat myself 3 times and i can't help feeling like someone could point me in a more elegant direction.
I'm looking for something along the lines of
Declare @Priorities text = ['H','M','L']
Foreach(priority in priorities)
(SELECT SUM(
CASE
WHEN(jlh_inner.StartDate IS NULL) THEN 0
WHEN(jlh_inner.EndDate IS NULL) THEN
DATEDIFF(dd, jlh_inner.StartDate, GETDATE())
ELSE
DATEDIFF(dd, jlh_inner.StartDate, jlh_inner.EndDate)
END)
FROM ListingHistory jlh_inner
WHERE jlh_inner.JobTitleId = jlh.JobTitleId
AND jlh_inner.OfficeCode = rof.code
AND jlh_inner.Priority = priority)
Full code right now is:
BEGIN
SELECT rof.location AS location,
jlh.TitleId AS TitleId,
jt.Title AS Title,
(SELECT SUM(
CASE
WHEN(jlh_inner.StartDate IS NULL) THEN 0
WHEN(jlh_inner.EndDate IS NULL) THEN
DATEDIFF(dd, jlh_inner.StartDate, GETDATE())
ELSE
DATEDIFF(dd, jlh_inner.StartDate, jlh_inner.EndDate)
END)
FROM ListingHistory jlh_inner
WHERE jlh_inner.JobTitleId = jlh.JobTitleId
AND jlh_inner.OfficeCode = rof.code
AND jlh_inner.Priority = 'H') AS HighPriorityDays,
(SELECT SUM(
CASE
WHEN(jlh_inner.StartDate IS NULL) THEN 0
WHEN(jlh_inner.EndDate IS NULL) THEN
DATEDIFF(dd, jlh_inner.StartDate, GETDATE())
ELSE
DA开发者_StackOverflow社区TEDIFF(dd, jlh_inner.StartDate, jlh_inner.EndDate)
END)
FROM ListingHistory jlh_inner
WHERE jlh_inner.JobTitleId = jlh.JobTitleId
AND jlh_inner.OfficeCode = rof.code
AND jlh_inner.Priority = 'M') AS MediumPriorityDays,
(SELECT SUM(
CASE
WHEN(jlh_inner.StartDate IS NULL) THEN 0
WHEN(jlh_inner.EndDate IS NULL) THEN
DATEDIFF(dd, jlh_inner.StartDate, GETDATE())
ELSE
DATEDIFF(dd, jlh_inner.StartDate, jlh_inner.EndDate)
END)
FROM ListingHistory jlh_inner
WHERE jlh_inner.JobTitleId = jlh.JobTitleId
AND jlh_inner.OfficeCode = rof.code
AND jlh_inner.Priority = 'L') AS LowPriorityDays
FROM Offices rof,
ListingHistory jlh,
JobTitle jt
WHERE rof.code = jlh.OfficeCode
AND jt.JobTitleID = jlh.JobTitleId
GROUP BY rof.location, rof.code, jlh.TitleId, jt.Title
ORDER BY location
END
You have some very bad coding practices that you need to stop using entirely and correlated subqueries are one of them. They are performance killers and there is no reason to ever use one. Also you should start using explicit joins especially since you are using SQL Server and the inmplicit join syntax for outer joins not only is deprecated but is actually broken even back to SQL server 2000 and combining implicit inner and explicit out joins often leads to incorrect results and thus implicit joins are harder to maintain, and are much more likely to have accidental cross joins and are, of course, almost 20 years out of date.
See if this code does what you need:
SELECT rof.location AS location,
jlh.TitleId AS TitleId,
jt.Title AS Title,
SUM(CASE WHEN(jlh.StartDate IS NULL and Priority <> 'H') THEN 0
WHEN(jlh.EndDate IS NULL AND Priority = 'H')
THEN DATEDIFF(dd, jlh.StartDate, GETDATE())
WHEN (jlh.EndDate IS NOT NULL AND Priority = 'H') THEN DATEDIFF(dd, jlh.StartDate, jlh.EndDate) END) AS HighPriorityDays,
SUM(CASE WHEN(jlh.StartDate IS NULL and Priority <> 'M') THEN 0
WHEN(jlh.EndDate IS NULL AND Priority = 'M')
THEN DATEDIFF(dd, jlh.StartDate, GETDATE())
WHEN (jlh.EndDate IS NOT NULL AND Priority = 'M') THEN DATEDIFF(dd, jlh.StartDate, jlh.EndDate) END) AS MediumPriorityDays,
SUM(CASE WHEN(jlh.StartDate IS NULL and Priority <> 'L') THEN 0
WHEN(jlh.EndDate IS NULL AND Priority = 'L')
THEN DATEDIFF(dd, jlh.StartDate, GETDATE())
WHEN (jlh.EndDate IS NOT NULL AND Priority = 'L') THEN DATEDIFF(dd, jlh.StartDate, jlh.EndDate) END) AS LowPriorityDays
FROM Offices rof
JOIN ListingHistory jlh
ON rof.code = jlh.OfficeCode
JOIN JobTitle jt
ON jt.JobTitleID = jlh.JobTitleId
GROUP BY rof.location, rof.code, jlh.TitleId, jt.Title
ORDER BY location
Agree with HLGEM, using the join syntax is more appropriate. It also fits into the OUTER APPLY syntax. This achieves what you want with consise code without revisiting the table 3 times.
SELECT rof.location AS location,
jlh.TitleId AS TitleId,
jt.Title AS Title,
sum(case when PD.Priority='H' then PD.PriorityDays end) as HighPriorityDays,
sum(case when PD.Priority='M' then PD.PriorityDays end) as MediumPriorityDays,
sum(case when PD.Priority='L' then PD.PriorityDays end) as LowPriorityDays
FROM Offices rof
inner join ListingHistory jlh on rof.code = jlh.OfficeCode
inner join JobTitle jt on jt.JobTitleID = jlh.JobTitleId
outer apply
(SELECT jlh_inner.Priority, SUM(
CASE
WHEN(jlh_inner.StartDate IS NULL) THEN 0
WHEN(jlh_inner.EndDate IS NULL) THEN
DATEDIFF(dd, jlh_inner.StartDate, GETDATE())
ELSE
DATEDIFF(dd, jlh_inner.StartDate, jlh_inner.EndDate)
END) as PriorityDays
FROM ListingHistory jlh_inner
WHERE jlh_inner.JobTitleId = jlh.JobTitleId
AND jlh_inner.OfficeCode = rof.code
AND jlh_inner.Priority in ('H','L','M')
GROUP BY jlh_inner.Priority) PD
GROUP BY rof.location, rof.code, jlh.TitleId, jt.Title
ORDER BY location
Rather than declaring @priorities as an array of texts, make it a table, then insert your three values into the @priorities table and join your existing aggregation query on the newly-minted @priorities table. This will get you one result row per row of your priorities table, which I think is what you're really after, right?
Here's how I'd do the table setup in MS SQL:
declare @priorities table (
priority char(1)
)
insert into @priorities (priority) values ('H')
insert into @priorities (priority) values ('M')
insert into @priorities (priority) values ('L')
This would make your end implementation include something like this:
SELECT jlh_inner.Priority, SUM(
CASE
WHEN(jlh_inner.StartDate IS NULL) THEN 0
WHEN(jlh_inner.EndDate IS NULL) THEN
DATEDIFF(dd, jlh_inner.StartDate, GETDATE())
ELSE
DATEDIFF(dd, jlh_inner.StartDate, jlh_inner.EndDate)
END)
FROM ListingHistory jlh_inner
INNER JOIN @priorities p on jlh_inner.Priority = p.priorities
WHERE jlh_inner.JobTitleId = jlh.JobTitleId
AND jlh_inner.OfficeCode = rof.code
Then you add the Priority field to the GROUP BY clause you've already got, and things should aggregate up nicely.
Here's how I would refactor it using a common table expression (which represents the summaries grouped appropriately) and explicit joins:
WITH Summaries AS (
SELECT jlh.JobTitleId
,jlh.OfficeCode
,jlh.Priority
,SUM(
CASE
WHEN(jlh.StartDate IS NULL) THEN 0
WHEN(jlh.EndDate IS NULL) THEN
DATEDIFF(dd, jlh.StartDate, GETDATE())
ELSE
DATEDIFF(dd, jlh.StartDate, jlh.EndDate)
END
) AS DayCount
FROM ListingHistory jlh
GROUP BY jlh.JobTitleId
,jlh.OfficeCode
,jlh.Priority
)
SELECT rof.location AS location
,jlh.TitleId AS TitleId
,jt.Title AS Title
,s_h.DayCount AS HighPriorityDays
,s_m.DayCount AS MediumPriorityDays
,s_l.DayCount AS LowPriorityDays
FROM Offices rof
INNER JOIN ListingHistory jlh
ON rof.code = jlh.OfficeCode
INNER JOIN JobTitle jt
ON jt.JobTitleID = jlh.JobTitleId
LEFT JOIN Summaries s_h
ON s_h.JobTitleId = jlh.JobTitleId
AND s_h.OfficeCode = rof.code
AND s_h.Priority = 'H'
LEFT JOIN Summaries s_m
ON s_m.JobTitleId = jlh.JobTitleId
AND s_m.OfficeCode = rof.code
AND s_m.Priority = 'M'
LEFT JOIN Summaries s_l
ON s_l.JobTitleId = jlh.JobTitleId
AND s_l.OfficeCode = rof.code
AND s_l.Priority = 'L'
GROUP BY rof.location, rof.code, jlh.TitleId, jt.Title
ORDER BY location
精彩评论