Tutorial :Simplifying (aliasing) T-SQL CASE statements. Any improvement possible?



Question:

As you can see, this sucks big time. Any alternative? I've tried using the column alias in the group by clause to no avail.

select count(callid) ,  case          when callDuration > 0 and callDuration < 30 then 1          when callDuration >= 30 and callDuration < 60 then 2          when callDuration >= 60 and callDuration < 120 then 3          when callDuration >= 120 and callDuration < 180 then 4          when callDuration >= 180 and callDuration < 240 then 5          when callDuration >= 240 and callDuration < 300 then 6          when callDuration >= 300 and callDuration < 360 then 7          when callDuration >= 360 and callDuration < 420 then 8          when callDuration >= 420 and callDuration < 480 then 9          when callDuration >= 480 and callDuration < 540 then 10          when callDuration >= 540 and callDuration < 600 then 11          when callDuration >= 600 then 12  end as duration  from callmetatbl  where programid = 1001 and callDuration > 0  group by case          when callDuration > 0 and callDuration < 30 then 1          when callDuration >= 30 and callDuration < 60 then 2          when callDuration >= 60 and callDuration < 120 then 3          when callDuration >= 120 and callDuration < 180 then 4          when callDuration >= 180 and callDuration < 240 then 5          when callDuration >= 240 and callDuration < 300 then 6          when callDuration >= 300 and callDuration < 360 then 7          when callDuration >= 360 and callDuration < 420 then 8          when callDuration >= 420 and callDuration < 480 then 9          when callDuration >= 480 and callDuration < 540 then 10          when callDuration >= 540 and callDuration < 600 then 11          when callDuration >= 600 then 12  end  

EDIT: I really meant to ask how to have a single case source, but case modifications are welcome anyway (although less useful because the intervals probably will be modified and might even be automatically generated).

As has been considered by some people, callDuration is indeed a float so some listed solutions are not valid for my use case, by leaving values out of the intervals.

Lessons:

  • Look for patterns in the case expression to reduce it if possible and worthwhile

     case      when callDuration > 0 AND callDuration < 30 then 1      when callDuration > 600 then 12      else floor(callDuration/60) + 2  end   end as duration  
  • Use inline views to have a single source of the case

    select count(d.callid), d.duration  from (        select callid          , case             when callDuration > 0 AND callDuration < 30 then 1             when callDuration > 600 then 12             else floor(callDuration/60) + 2  end            end as duration      from callmetatbl      where programid = 1001            and callDuration > 0  ) d  group by d.duration  
  • Or use common table expressions

       with duration_case as (        select callid ,        case          when callDuration > 0 AND callDuration < 30 then 1          when callDuration > 600 then 12          else floor(callDuration/60) + 2  end        end as duration     from callmetatbl     where programid = 1001 and callDuration > 0 )      select count(callid), duration      from duration_case      group by duration  
  • Or use an user defined function (no example so far :-) )

  • Or use a lookup table and a join

    DECLARE @t TABLE(durationFrom float, durationTo float, result INT)  --populate table with values so the query works  select count(callid) , COALESCE(t.result, 12)  from callmetatbl JOIN @t AS t ON callDuration >= t.durationFrom   AND callDuration < t.durationTo   where programid = 1001 and callDuration > 0  

Thanks to everybody and I'm having a very difficult time choosing an accepted answer, as many covered different parts of the question (and I was there thinking it was a simple question with a straightforward answer :-), sorry for the confusion).


Solution:1

Q: how to get an alias to use in the GROUP BY clause

One approach is to use an inline view. [EDIT] The answer from Remus Rusanu (+1!) gives an example of a Common Table Expression to accomplish the same thing. [/EDIT]

The inline view gets you a simple "alias" for the complex expression which you can then reference in a GROUP BY clause in an outer query:

select count(d.callid)       , d.duration    from (select callid               , case                 when callDuration >= 600 then 12                 when callDuration >= 540 then 11                 when callDuration >= 480 then 10                 when callDuration >= 420 then 9                 when callDuration >= 360 then 8                 when callDuration >= 300 then 7                 when callDuration >= 240 then 6                 when callDuration >= 180 then 5                 when callDuration >= 120 then 4                 when callDuration >=  60 then 3                 when callDuration >=  30 then 2                 when callDuration >    0 then 1                 --else null                 end as duration               from callmetatbl              where programid = 1001                and callDuration > 0         ) d  group by d.duration  

Let's unpack that.

  • the inner (indented) query is called and inline view (we given it an alias d)
  • in the outer query, we can reference the alias duration from d

That should be sufficient to answer your question. If you're looking for an equivalent replacement expression, the one from tekBlues (+1 !) is the right answer (it works on the boundary and for non-integers.)

With the replacement expression from tekBlues (+1!):

select count(d.callid)       , d.duration    from (select callid               , case                  when callduration >=30 and callduration<600                      then floor(callduration/60)+2                 when callduration>0 and callduration< 30                      then 1                  when callduration>=600                      then 12                 end as duration            from callmetatbl           where programid = 1001             and callDuration > 0         ) d   group by d.duration  

(This should be sufficient to answer your question.)


[UPDATE:] sample user defined function (a replacement for inline CASE expression)

CREATE FUNCTION [dev].[udf_duration](@cd FLOAT)  RETURNS SMALLINT  AS  BEGIN    DECLARE @bucket SMALLINT    SET @bucket =     CASE    WHEN @cd >= 600 THEN 12    WHEN @cd >= 540 THEN 11    WHEN @cd >= 480 THEN 10    WHEN @cd >= 420 THEN 9    WHEN @cd >= 360 THEN 8    WHEN @cd >= 300 THEN 7    WHEN @cd >= 240 THEN 6    WHEN @cd >= 180 THEN 5    WHEN @cd >= 120 THEN 4    WHEN @cd >=  60 THEN 3    WHEN @cd >=  30 THEN 2    WHEN @cd >    0 THEN 1    --ELSE NULL    END    RETURN @bucket  END    select count(callid)       , [dev].[udf_duration](callDuration)    from callmetatbl   where programid = 1001     and callDuration > 0   group by [dev].[udf_duration](callDuration)  

NOTES: be aware that the user defined function will add overhead, and (of course) add a dependency on another database object.

This example function is equivalent to the original expression. The OP CASE expression doesn't have any gaps, but it does reference each "breakpoint" twice, I prefer to test only the lower bound. (CASE returns when a condition is satisfied. Doing the tests in reverse lets the unhandled case (<=0 or NULL) fall through without test, an ELSE NULL is not necessary, but could be added for completeness.

ADDITIONAL DETAILS

(Be sure to check the performance and the optimizer plan, to make sure it's the same as (or not significantly worse than) the original. In the past, I've had problems getting predicates pushed into the inline view, doesn't look like it will be a problem in your case.)

stored view

Note that the inline view could also be stored as view definition in the database. But there's no reason to do that, other than to "hide" the complex expression from your statement.

simplifying the complex expression

Another way to make a complex expression "simpler" is to use a user defined function. But a user defined function comes with its own set of issues (including degraded performance.)

add database "lookup" table

Some answers recommend adding a "lookup" table to the database. I don't see that this is really necessary. It could be done of course, and could make sense if you want to be able to derive different values for duration from callDuration, on the fly, without having to modify your query and without having to run any DDL statements (e.g. to alter a view definition, or modify a user defined function).

With a join to a "lookup" table, one benefit is that you could make the query return different result sets by just performing DML operations on the "lookup" table.

But that same advantage may actually be a drawback as well.

Consider carefully if the benefit actually outweighs the downside. Consider the impact that new table will have on unit testing, how to verify the contents of the lookup table are valid and not changed (any overlaps? any gaps?), impact on ongoing maintenance to the code (due to the additional complexity).

some BIG assumptions

A lot of the answers given here seem to assume that callDuration is an INTEGER datatype. It seems they have overlooked the possibility that it's not an integer, but maybe I missed that nugget in the question.

It's fairly simple test case to demonstrate that:

callDuration BETWEEN 0 AND 30  

is NOT equivalent to

callDuration > 0 AND callDuration < 30  


Solution:2

Is there any reason you're not using between? The case statements themselves don't look too bad. If you really hate it you could throw all this into a table and map it.

Durations  ------------------  low   high   value  0     30     1  31    60     2  

etc...

(SELECT value FROM Durations WHERE callDuration BETWEEN low AND high) as Duration  

EDIT: Or, in a case where floats are being used and between becomes cumbersome.

(SELECT value FROM Durations WHERE callDuration >= low AND callDuration <= high) as Duration  


Solution:3

the case can be written like this:

case   when callduration >=30 and callduration<600 then floor(callduration/60)+2  when callduration>0 and callduration< 30 then 1   when callduration>=600 then 12  end  

The having is not needed, replace it by a "where callduration>0"

I like the translate table answer given before! that's the best solution


Solution:4

You need to push the CASE further down the query tree so that its projection is visible to the GROUP BY. This can be achieve in two ways:

  1. Use a derived table (already Spencer, Adam and Jeremy showed how)
  2. Use a common table expressions

    with duration_case as (  select callid ,  case      when callDuration > 0 and callDuration < 30 then 1      when callDuration >= 30 and callDuration < 60 then 2      when callDuration >= 60 and callDuration < 120 then 3      when callDuration >= 120 and callDuration < 180 then 4      when callDuration >= 180 and callDuration < 240 then 5      when callDuration >= 240 and callDuration < 300 then 6      when callDuration >= 300 and callDuration < 360 then 7      when callDuration >= 360 and callDuration < 420 then 8      when callDuration >= 420 and callDuration < 480 then 9      when callDuration >= 480 and callDuration < 540 then 10      when callDuration >= 540 and callDuration < 600 then 11      when callDuration >= 600 then 12  end as duration  from callmetatbl  where programid = 1001 and callDuration > 0 )     select count(callid), duration     from duration_case     group by duration  

Both solutions are equivalent in every respect. I find CTEs more readable, some prefer derived tables as more portable.


Solution:5

Divide callDuration by 60:

case          when callDuration between 1 AND 29 then 1          when callDuration > 600 then 12          else (callDuration /60) + 2  end  end as duration  

Note that between is inclusive of the bounds, and I'm assuming callDuration will be treated as an integer.


Update:
Combine this with some of the other answers, and you can get the entire query down to this:

select count(d.callid), d.duration  from (            select callid              , case                  when callDuration between 1 AND 29 then 1                  when callDuration > 600 then 12                  else (callDuration /60) + 2  end                end as duration          from callmetatbl          where programid = 1001                and callDuration > 0      ) d  group by d.duration  


Solution:6

select count(callid), duration from  (      select callid ,      case              when callDuration > 0 and callDuration < 30 then 1              when callDuration >= 30 and callDuration < 60 then 2              when callDuration >= 60 and callDuration < 120 then 3              when callDuration >= 120 and callDuration < 180 then 4              when callDuration >= 180 and callDuration < 240 then 5              when callDuration >= 240 and callDuration < 300 then 6              when callDuration >= 300 and callDuration < 360 then 7              when callDuration >= 360 and callDuration < 420 then 8              when callDuration >= 420 and callDuration < 480 then 9              when callDuration >= 480 and callDuration < 540 then 10              when callDuration >= 540 and callDuration < 600 then 11              when callDuration >= 600 then 12      end as duration      from callmetatbl      where programid = 1001 and callDuration > 0  ) source  group by duration  


Solution:7

Untested:

select  count(callid) , duracion  from      (select           callid,          case                      when callDuration > 0 and callDuration < 30 then 1                      when callDuration >= 30 and callDuration < 60 then 2                      when callDuration >= 60 and callDuration < 120 then 3                      when callDuration >= 120 and callDuration < 180 then 4                      when callDuration >= 180 and callDuration < 240 then 5                      when callDuration >= 240 and callDuration < 300 then 6                      when callDuration >= 300 and callDuration < 360 then 7                      when callDuration >= 360 and callDuration < 420 then 8                      when callDuration >= 420 and callDuration < 480 then 9                      when callDuration >= 480 and callDuration < 540 then 10                      when callDuration >= 540 and callDuration < 600 then 11                      when callDuration >= 600 then 12                      else 0          end as duracion      from callmetatbl      where programid = 1001) GRP  where duracion > 0  group by duracion  


Solution:8

Add all the cases into a table variable and do an outer join

DECLARE @t TABLE(durationFrom INT, durationTo INT, result INT)  --        when callDuration > 0 and callDuration < 30 then 1  INSERT INTO @t VALUES(1, 30, 1);  --        when callDuration >= 30 and callDuration < 60 then 2  INSERT INTO @t VALUES(30, 60, 2);    select count(callid) , COALESCE(t.result, 12)  from callmetatbl JOIN @t AS t ON callDuration >= t.durationFrom AND callDuration  < t.durationTo   where programid = 1001 and callDuration > 0  


Solution:9

Here's my shot at it. All of the components you need can be done in straight SQL.

select    count(1) as total   ,(fixedDuration / divisor) + adder as duration  from  (      select        case/*(30s_increments_else_60s)*/when(callDuration<60)then(120)else(60)end as divisor       ,case/*(increment_by_1_else_2)*/when(callDuration<30)then(1)else(2)end as adder       ,(/*duration_capped@600*/callDuration+600-ABS(callDuration-600))/2 as fixedDuration       ,callDuration      from         callmetatbl      where        programid = 1001      and         callDuration > 0  ) as foo  group by    (fixedDuration / divisor) + adder  

Here's the SQL I used for testing. (I don't have my own personal callmetatbl ;)

select    count(1) as total   ,(fixedDuration / divisor) + adder as duration  from  (      select        case/*(30s_increments_else_60s)*/when(callDuration<60)then(120)else(60)end as divisor       ,case/*(increment_by_1_else_2)*/when(callDuration<30)then(1)else(2)end as adder       ,(/*duration_capped@600*/callDuration+600-ABS(callDuration-600))/2 as fixedDuration       ,callDuration      from -- callmetatbl -- using test view below        (           select 1001 as programid,   0 as callDuration union         select 1001 as programid,   1 as callDuration union         select 1001 as programid,  29 as callDuration union         select 1001 as programid,  30 as callDuration union         select 1001 as programid,  59 as callDuration union         select 1001 as programid,  60 as callDuration union         select 1001 as programid, 119 as callDuration union         select 1001 as programid, 120 as callDuration union         select 1001 as programid, 179 as callDuration union         select 1001 as programid, 180 as callDuration union         select 1001 as programid, 239 as callDuration union         select 1001 as programid, 240 as callDuration union         select 1001 as programid, 299 as callDuration union         select 1001 as programid, 300 as callDuration union         select 1001 as programid, 359 as callDuration union         select 1001 as programid, 360 as callDuration union         select 1001 as programid, 419 as callDuration union         select 1001 as programid, 420 as callDuration union         select 1001 as programid, 479 as callDuration union         select 1001 as programid, 480 as callDuration union         select 1001 as programid, 539 as callDuration union         select 1001 as programid, 540 as callDuration union         select 1001 as programid, 599 as callDuration union         select 1001 as programid, 600 as callDuration union         select 1001 as programid,1000 as callDuration        ) as callmetatbl      where        programid = 1001      and         callDuration > 0  ) as foo  group by    (fixedDuration / divisor) + adder  

The SQL output is shown below, as 2 records counted for each duration (bucket) 1 through 12.

total  duration  2             1  2             2  2             3  2             4  2             5  2             6  2             7  2             8  2             9  2            10  2            11  2            12  

Here are the results from the "foo" sub-query:

divisor adder   fixedDuration  callDuration  120         1               1             1  120         1              29            29  120         2              30            30  120         2              59            59  60          2              60            60  60          2             119           119  60          2             120           120  60          2             179           179  60          2             180           180  60          2             239           239  60          2             240           240  60          2             299           299  60          2             300           300  60          2             359           359  60          2             360           360  60          2             419           419  60          2             420           420  60          2             479           479  60          2             480           480  60          2             539           539  60          2             540           540  60          2             599           599  60          2             600           600  60          2             600          1000  

Cheers.


Solution:10

What's so wrong with a User Defined Function here? You could both visually clean up the code and centralize the functionality that way. Performance-wise, I can't see the hit being too horrible unless you are doing something really retarded within said UDF.


Solution:11

Create a lookup table for duration
Using a look up table will speed up the SELECT statement as well.

Here is the end result of how it will look with lookup table.

select  count(a.callid), b.ID as duration  from    callmetatbl a          inner join DurationMap b            on a.callDuration >= b.Minimum          and a.callDuration < IsNUll(b.Maximum, a.CallDuration + 1)  group by  b.ID  

Here is the look up table.

create table DurationMap (      ID          int identity(1,1) primary key,      Minimum     int not null,      Maximum     int   )    insert  DurationMap(Minimum, Maximum) select 0,30  insert  DurationMap(Minimum, Maximum) select 30,60  insert  DurationMap(Minimum, Maximum) select 60,120  insert  DurationMap(Minimum, Maximum) select 120,180  insert  DurationMap(Minimum, Maximum) select 180,240  insert  DurationMap(Minimum, Maximum) select 240,300  insert  DurationMap(Minimum, Maximum) select 300,360  insert  DurationMap(Minimum, Maximum) select 360,420  insert  DurationMap(Minimum, Maximum) select 420,480  insert  DurationMap(Minimum, Maximum) select 480,540  insert  DurationMap(Minimum, Maximum) select 540,600  insert  DurationMap(Minimum) select 600  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »