Tutorial: T-SQL GROUP BY: Best way to include other grouped columns



Question:

I'm a MySQL user who is trying to port some things over to MS SQL Server.

I'm joining a couple of tables, and aggregating some of the columns via GROUP BY.

A simple example would be employees and projects:

select empID, fname, lname, title, dept, count(projectID)
from employees E
left join projects P on E.empID = P.projLeader
group by empID

...that would work in MySQL, but MS SQL is stricter and requires that everything is either enclosed in an aggregate function or is part of the GROUP BY clause.

So, of course, in this simple example, I assume I could just include the extra columns in the group by clause. But the actual query I'm dealing with is pretty complicated, and includes a bunch of operations performed on some of the non-aggregated columns... i.e., it would get REALLY ugly to try to include all of them in the group by clause.

So is there a better way to do this?


Solution:1

You can get it to work with something along these lines:

select e.empID, fname, lname, title, dept, projectIDCount
from (
    select empID, count(projectID) as projectIDCount
    from employees E
    left join projects P on E.empID = P.projLeader
    group by empID
) idList
inner join employees e on idList.empID = e.empID

This way the GROUP BY only covers the key column, and the outer join back to employees can return any columns you want. In some scenarios it also gives you a better chance of making good use of indexes (if you are not returning every column), and it combines well with paging.


Solution:2

"it would get REALLY ugly to try to include all of them in the group by clause."

Yup - that's the only way to do it * - just copy and paste the non-aggregated columns into the group by clause, remove the aliases and that's as good as it gets...

*you could wrap it in a nested SELECT but that's probably just as ugly...
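As a rough illustration of the copy-and-paste approach described above, here is a minimal sketch of the question's query with every non-aggregated column repeated in the GROUP BY clause. The sample rows are made up, and SQLite (through Python's sqlite3 module) stands in for SQL Server only so the query can be run as-is; the SELECT itself uses nothing beyond plain GROUP BY syntax.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employees (empID integer primary key, fname text, lname text,
                            title text, dept text);
    create table projects (projectID integer primary key, projLeader integer);
    insert into employees values (1, 'Ann', 'Lee', 'Manager', 'IT'),
                                 (2, 'Bob', 'Ray', 'Analyst', 'HR');
    insert into projects values (10, 1), (11, 1);
""")

# Every selected column that is not inside an aggregate also appears in GROUP BY.
full_group_by = """
    select E.empID, E.fname, E.lname, E.title, E.dept,
           count(P.projectID) as projectCount
    from employees E
    left join projects P on E.empID = P.projLeader
    group by E.empID, E.fname, E.lname, E.title, E.dept
    order by E.empID
"""
full_group_by_rows = conn.execute(full_group_by).fetchall()
for row in full_group_by_rows:
    print(row)
```

Because count(P.projectID) ignores NULLs, an employee with no projects still appears, with a count of 0.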


Solution:3

MySQL is unusual, and technically not compliant with the SQL-92 standard, in allowing you to omit items from the GROUP BY clause. In SQL-92, each non-aggregate column in the select-list must be listed in full in the GROUP BY clause. Some databases also accept an ordinal position there, but SQL Server does not, so list the columns by name.

(Oh, although MySQL is unusual, it is nice that it allows the shorthand.)


Solution:4

You do not need the join in the subquery, because there is no need to group by empID from employees: you can group on the projLeader column from projects instead.

With the inner join (as written below) you'll get the list of employees that have at least one project. If you want the list of all employees, just change it to a left join.

select e.empID, e.fname, e.lname, e.title, e.dept, p.projectIDCount
from employees e
inner join (
    select projLeader, count(*) as projectIDCount
    from projects
    group by projLeader
) p on p.projLeader = e.empID
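One detail worth checking if you switch this to a left join: employees without projects have no matching row in the derived table, so their count comes back as NULL rather than 0. The sketch below wraps the count in COALESCE to handle that. The sample rows are hypothetical, and SQLite (via Python's sqlite3 module) is used only as a runnable stand-in for SQL Server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employees (empID integer primary key, fname text, lname text,
                            title text, dept text);
    create table projects (projectID integer primary key, projLeader integer);
    insert into employees values (1, 'Ann', 'Lee', 'Manager', 'IT'),
                                 (2, 'Bob', 'Ray', 'Analyst', 'HR');
    insert into projects values (10, 1), (11, 1);
""")

# Left join keeps employees with no projects; COALESCE turns their NULL count into 0.
left_join_counts = """
    select e.empID, e.fname, e.lname, e.title, e.dept,
           coalesce(p.projectIDCount, 0) as projectIDCount
    from employees e
    left join (
        select projLeader, count(*) as projectIDCount
        from projects
        group by projLeader
    ) p on p.projLeader = e.empID
    order by e.empID
"""
left_join_rows = conn.execute(left_join_counts).fetchall()
for row in left_join_rows:
    print(row)
```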


Solution:5

A subquery in the select clause might also be suitable. It would work for the example given but might not for the actual complicated query you are dealing with.

select e.empID, fname, lname, title, dept,
       (select count(*) from projects p where p.projLeader = e.empID) as projectCount
from employees e
