Distinct not returning expected result?
What wrong with this code (or my sanity for that matter :D ).
The Distinct clause in the Select statement is NOT working. It's result set contains All rows for Each EmpID in ej. Please help!!
Declare @SurveyID int;
SET @SurveyID = 2;
IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t
IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t1
SELECT Distinct(ej.EmpID),
ej.JobCode,
ej.SurveyID,
ej.IsCompleted,
Coalesce(ej.Declined,0) AS Declined,
emp.Company,
emp.UserID,
emp.LoginRecord,
count(JobCode) AS Benchmark
into #t
FROM SAS.EmployerJobs ej
INNER JOIN SAS.Employer emp
ON ej.EmpID = emp.EmpID
Where (ej.SurveyID = @SurveyID)
GROUP BY ej.EmpID, ej.JobCode, ej.SurveyID, ej.IsCompleted, Decli开发者_开发问答ned, emp.Company, emp.UserID, emp.LoginRecord
Order by ej.EmpID ASC
Select * from #t
drop table #t
You cannot select distinct on specific columns in SQL Server.
Because of this SQL Server is returning every distinct combination of columns. What you are asking it to do is arbitrarily drop data that you have told it to select.
If you want it to start dropping out records then you will need to specify which ones you want. One way to do this is to specify MIN or MAX values for the columns that you do not want repeated.
I tend to avoid doing this because in my experience most of the time when this this is happening it is because of weird logic in the query. Without seeing your source data and knowing your business rules it would be hard to tell you how to change your query.
UPDATE:
based on your comment:
Business rule is to count number of JobCodes (as Benchmarks) for each EmpID in EmployerJobs Table (after filtering on SurveyID).
I would change your query to this:
SELECT EmpID,
COUNT(DISTINCT JobCode)
FROM SAS.EmployerJobs
Where SurveyID = @SurveyID
GROUP BY EmpId
This will return every EmpId
along with the number of distinct JobCode
s associated with that ID. If the business rule you gave me is accruate then the rest of your query is exactly the kind of weird logic that I was talking about.
If you only require one row for each distinct EmpID in the result set, then you need to decide how to aggregate the multiple rows of data (multiple jobs) for each EmpID. For example, choosing the minimum value.
But from the names and relations, it seems that an EmpID can have multiple jobs, so it doesn't seem to make sense to expect only one row per EmpID if you want also to list the individual jobs.
UPDATE: To count the number of JobCodes
SELECT ej.EmpID,
ej.SurveyID,
emp.Company,
emp.UserID,
emp.LoginRecord,
count(JobCode) AS Benchmark
into #t
FROM SAS.EmployerJobs ej
INNER JOIN SAS.Employer emp
ON ej.EmpID = emp.EmpID
Where (ej.SurveyID = @SurveyID)
GROUP BY ej.EmpID, ej.SurveyID, emp.Company, emp.UserID, emp.LoginRecord
Order by ej.EmpID ASC
Your original query attempted to pull in data that varies per job, and that is why you were not getting the aggregation. (E.g. IsCompleted, declined etc..) For aggregation to work, only include data that is an aggregate of the job data, like count(JobCode)
, or doesn't vary per job, such as employer attributes.
Distinct is not a function. See here for a great explanation.
http://weblogs.sqlteam.com/jeffs/archive/2007/10/12/sql-distinct-group-by.aspx
The distinct keyword as you have used it will only ensure that no duplicates over all of the returned columns will be included in the result set.
Use a group by clause to get your desired result.
http://msdn.microsoft.com/en-us/library/ms177673.aspx
Note that you will need to use an aggregate function (sum, max, avg for example) on any columns in the select clause that you do not include in th group by clause.
If you want the results to include only a single row for each employee ID then that is the only column that you should include in the group by clause.
Try this (note that a temp table adds no value here and has been removed).
SELECT
EmpID,
COUNT(*)
FROM
SAS.EmployerJobs
Where
SurveyID = @SurveyID
GROUP BY
EmpId
Note: Go with Abe Miessler's answer if it is possible for an employee to be associated multiple times with the same job.
What do you want it to do? Why do you have a Distinct and a Group By?
SELECT
ej.EmpID,
ej.JobCode,
ej.SurveyID,
ej.IsCompleted,
Coalesce(ej.Declined,0) AS Declined,
emp.Company,
emp.UserID,
emp.LoginRecord,
count(JobCode) AS Benchmark
FROM SAS.EmployerJobs ej
INNER JOIN SAS.Employer emp
ON ej.EmpID = emp.EmpID
Where ej.SurveyID = '2'
GROUP BY ej.EmpID, ej.JobCode, ej.SurveyID, ej.IsCompleted, Declined, emp.Company, emp.UserID, emp.LoginRecord
Order by ej.EmpID ASC
精彩评论