SQL - turning a trace table into run-time statistics
I have a trace table in the following format.
CREATE TABLE [dbo].[trace](
[trcId] [bigint] IDENTITY(开发者_运维技巧1,1) NOT NULL,
[trcDateTime] [datetime] NULL,
[trcProgram] [nvarchar](150) NULL,
[trcCode] [nvarchar](8) NULL,
[trcText] [nvarchar](max) NULL,
[trcXML] [nvarchar](max) NULL,
[trcCorrGuid] [nvarchar](36) NULL, ...
Now I'm realizing that I can use selected rows from the trace to analyze response time. I'd like to create another table as follows:
CREATE TABLE [dbo].[executionHistory](
trcCorrId [nvarchar](36) NOT NULL,
startHIP datetime NOT NULL,
stopHIP datetime NOT NULL,
startOrch1 datetime NOT NULL,
stopOrch1 datetime NOT NULL,
startOrch2 datetime NOT NULL,
stopOrch2 datetime NOT NULL,
startWebMethoddatetime NOT NULL,
stopWebMethod datetime NOT NULL,
Or alternatively could create a more generic:
CREATE TABLE [dbo].[executionHistory](
trcCorrId [nvarchar](36) NOT NULL,
eventName [nvarchar](36) NOT NULL,
eventStart datetime NOT NULL,
eventStop datetime NOT NULL
Baiscally I have a C# (HIP) program that calls a BizTalk orchestration (orch1 -published as a WCF web service). That orchestration calls orch2 which callas another WCF web service. I'd like to record/analyze/summarize response times from the above table instead of my trace table.
The start of the HIP program can be determined as follows: where trcProgram = 'HIP' and trcCode = '0250' and the end of the HIP program where trcProgram = 'HIP' and trcCode = '0299'.
I have similar pattern for each of the four milestones, for example: start Orch1: where trcProgram = 'Orch1' and trcCode = '0010' and end Orch1: where trcProgram = 'Orch1' and trcCode = '9999'.
Each execution is uniquely identified with the trcCorrGuid. This is a GUID that is stamped on all rows related to the same execution.
How can I do an elegant and efficient SQL query to load the executionHistory table from the trace table? I toyed around for a while but realized that I what I was writing could be really sloppy - and take several passes - and I needed advice from someone who might have done something like this.
I was going to begin by getting a distinct list of Guids. I was then going to do something like a CASE statement, but need a CASE that can handle two values, not one (trcProgram and trcCode). Maybe nested CASE statements would work?
Thanks in advance,
Neal Walters
Update:
Here's what I'm working on so far:
select trace.trcDateTime,
EventName =
CASE trace.trcCode
-- HIP
WHEN '0250' THEN
CASE trace.trcProgram
WHEN 'HIP:RCT.HIP.Components:Push' THEN 'Start:HIP'
ELSE 'NA'
END
WHEN '0299' THEN
CASE trace.trcProgram
WHEN 'HIP:RCT.HIP.Components:Push' THEN 'Stop:HIP'
ELSE 'NA'
END
-- Orch 1
WHEN '0010' THEN
CASE trace.trcProgram
WHEN 'Orch:WCFSubmitPolicyAction' THEN 'Start:Orch:WCFSubmitPolicyAction'
WHEN 'Orch:CallRCTWebService' THEN 'Start:Orch:CallRCTWebService'
ELSE 'NA'
END
WHEN '9999' THEN
CASE trace.trcProgram
WHEN 'Orch:WCFSubmitPolicyAction' THEN 'Stop:Orch:WCFSubmitPolicyAction'
WHEN 'Orch:CallRCTWebService' THEN 'Stop:Orch:CallRCTWebService'
ELSE 'NA'
END
-- WebMethod Push
WHEN '1210' THEN
CASE trace.trcProgram
WHEN 'WebMethod:CreateValuationMinimal' THEN 'Start:WebMethod:CreateValuationMinimal'
ELSE 'NA'
END
WHEN '1289' THEN
CASE trace.trcProgram
WHEN 'Orch:CallRCTWebService' THEN 'Stop:WebMethod:CreateValuationMinimal'
ELSE 'NA'
END
-- WebMethod Pull
WHEN '1950' THEN
CASE trace.trcProgram
WHEN 'WebMethod:ExportValuationRecordIdCustom' THEN 'Start:WebMethod:ExportValuationRecordIdCustom'
ELSE 'NA'
END
WHEN '1951' THEN
CASE trace.trcProgram
WHEN 'WebMethod:ExportValuationRecordIdCustom' THEN 'Stop:WebMethod:ExportValuationRecordIdCustom'
ELSE 'NA'
END
ELSE 'NA'
END
from trace
I could store this to a temp table, and then work forward from there.
Based on Aaron's post, here's what I got working so far. I had to add the semi-colon after the "use" statement, and had to add "AS PivotTable" at the bottom.
use ESBSupport;
WITH trace_CTE AS
(
SELECT trcCorrId,
trcProgram + trcCode AS trcUniqueCode,
trcDateTime
FROM trace
WHERE (trcProgram = 'HIP:RCT.HIP.Components:Push' AND trcCode IN ('0250', '0299'))
OR (trcProgram = 'Orch:WCFSubmitPolicyAction' AND trcCode IN ('0010', '9999'))
)
SELECT
trcCorrId,
[HIP:RCT.HIP.Components:Push0250] AS startHIP,
[HIP:RCT.HIP.Components:Push0299] AS stopHIP,
[Orch:WCFSubmitPolicyAction0010] AS startOrch1,
[Orch:WCFSubmitPolicyAction9999] AS stopOrch1
-- // etc., continue this for the other events
FROM trace_CTE
PIVOT
(
MIN(trcDateTime)
FOR trcUniqueCode IN (
[HIP:RCT.HIP.Components:Push0250],
[HIP:RCT.HIP.Components:Push0299],
[Orch:WCFSubmitPolicyAction0010],
[Orch:WCFSubmitPolicyAction9999]
)
) as PivotTable
I don't know if you would call this elegant or efficient, but it is probably the best you can do:
WITH trace_CTE AS
(
SELECT trcCorrId, trcProgram + trcCode AS trcUniqueCode, trcDateTime
FROM trace
WHERE (trcProgram = 'HIP' AND trcCode IN ('0250', '0299'))
OR (trcProgram = 'Orch1' AND trcCode IN ('0010', '9999'))
OR ([more conditions here])
)
SELECT
trcCorrId,
[HIP0250] AS startHIP, [HIP0299] AS stopHIP,
[Orch10010] AS startOrch1, [Orch19999] AS stopOrch1
-- // etc., continue this for the other events
FROM trace_CTE
PIVOT
(
MIN(trcDateTime)
FOR trcUniqueCode IN
(
[HIP0250], [HIP0299],
[Orch10010], [Orch19999],
[(continue with other codes)]
)
)
I'm using a CTE just to clean up the syntax, it won't actually create another pass. PIVOT
is pretty efficient, although if your trace table is massive, this is still going to be slow.
In my experience, these problems are best handled with triggers whenever possible (context accumulation). Especially for what is probably set up as a write-only table, the overhead is minimal to do a few checks on INSERT, and it's going to be a lot more painful to do the same thing on-the-fly. If you don't do this, you'll probably end up with a nightly batch process crunching the numbers and putting them into an analysis table so users don't have to wait for the query to run.
Final note: In order to get any kind of performance whatsoever on this query, you are definitely going to need an index on (trcProgram, trcCode) that covers (trcCorrId, trcDateTime).
Food for thought: NVARCHAR(36) takes 72 bytes of storage + length info (variable length guids?). Uniqueidentifier
takes 16 bytes.
And you want to create an eventName nvarchar(36) based on... what? The trcProgram? That is nvarchar(150).
SELECT coalesce(start.trcCorrGuid, end.trcCorrGuid),
coalesce(start.trcProgram, end.trcProgram) as eventName,
start.trcDateTime as eventStart,
end.trcDateTime as eventEnd
FROM (
SELECT * FROM trace
WHERE trcCode IN ('0250', '0010', ...))
as start
FULL JOIN (
SELECT * FROM trace
WHERE trcCode IN ('0299', '9999', ...))
as end ON start.trcCorrGuid = end.trcCorrGuid
AND start.trcProgram = end.trcProgram;
This query uses a full join between start and end events to allow for any trace inaccuracies (correlations that miss either a stop either a start). The performance of the query will depend on what indexes are on [trace]. If all you have is a primary key of trcID then no query can do miracles, is better you write a cursor loop then. The query in my example would need an index on trcCorrGuid at least, and preferably one on (trcCorrGuid) include (trcCode, trcProgram)
精彩评论