Tutorial: SQL/MySQL SELECT and average over certain values



Question:

I have to work with an analysis tool that measures the Web Service calls to a server per hour. These measurements are inserted into a database. The following is a snippet of such a measurement:

    mysql> SELECT * FROM sample s LIMIT 4;
    +---------+------+-------+
    | service | hour | calls |
    +---------+------+-------+
    | WS04    |   04 |    24 |
    | WS12    |   11 |    89 |
    | WSI64   |   03 |    35 |
    | WSX52   |   01 |    25 |
    +---------+------+-------+
    4 rows in set (0.00 sec)

As the end result I would like to know the sum of all web service calls per hour of day. Obviously, this can be easily done with SUM() and GROUP BY:

    mysql> SELECT hour, SUM(calls) FROM sample s GROUP BY hour;
    +------+------------+
    | hour | SUM(calls) |
    +------+------------+
    |   00 |        634 |
    |   01 |        642 |
    |   02 |        633 |
    |   03 |        624 |
    |   04 |        420 |
    |   05 |        479 |
    |   06 |        428 |
    |   07 |        424 |
    |   08 |        473 |
    |   09 |        434 |
    |   10 |        485 |
    |   11 |        567 |
    |   12 |        526 |
    |   13 |        513 |
    |   14 |        555 |
    |   15 |        679 |
    |   16 |        624 |
    |   17 |        796 |
    |   18 |        752 |
    |   19 |        843 |
    |   20 |        827 |
    |   21 |        774 |
    |   22 |        647 |
    |   23 |        533 |
    +------+------------+
    24 rows in set (0.00 sec)

My problem is that in old sets, the web service calls in the hours from [00-11] were already summed up. The simple statement as listed above would therefore lead to

    mysql> SELECT hour, SUM(calls) FROM sample s GROUP BY hour;
    +------+------------+
    | hour | SUM(calls) |
    +------+------------+
    |   00 |       6243 | <------ sum of hours 00-11!
    |   12 |        526 |
    |   13 |        513 |
    |   14 |        555 |
    |   15 |        679 |
    |   16 |        624 |
    |   17 |        796 |
    |   18 |        752 |
    |   19 |        843 |
    |   20 |        827 |
    |   21 |        774 |
    |   22 |        647 |
    |   23 |        533 |
    +------+------------+
    13 rows in set (0.00 sec)

This is an undesirable result. To make the old sets [00,12,...,23] comparable to the new sets [00,01,...,23] I would like to have one statement that averages the value of [00] and distributes it over the missing hours, e.g.:

    +------+------------+
    | hour | SUM(calls) |
    +------+------------+
    |   00 |    6243/12 |
    |   01 |    6243/12 |
                [...]
    |   12 |        526 |
                [...]
    |   23 |        533 |
    +------+------------+

I can easily do this using temporary tables or views, but I don't know how to accomplish this without them.

Any ideas? Cause this is driving me crazy :P


Solution:1

You'll need a rowset with 12 rows in it (one per hour from 0 to 11) to join against.

The simplest solution is to combine 12 SELECT statements with UNION ALL:

    SELECT  COALESCE(morning.hour, sample.hour),
            SUM(CASE WHEN morning.hour IS NULL THEN calls ELSE calls / 12 END) AS calls
    FROM    sample
    LEFT JOIN
            (
            SELECT 0 AS hour
            UNION ALL
            SELECT 1
            ...
            UNION ALL
            SELECT 11
            ) AS morning
    ON      sample.hour = 0 AND sample.service IN ('old_service1', 'old_service2')
    GROUP BY
            1
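
To try the join out without a MySQL server, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. The sample rows, the service name old_service1 and the helper name spread_morning are made up for illustration. The only deliberate change to the SQL is dividing by 12.0: SQLite truncates integer division, whereas MySQL's / operator already returns a decimal.

```python
import sqlite3

# Same shape as the query above, with the twelve morning hours written out.
QUERY = """
SELECT  COALESCE(morning.hour, sample.hour) AS hour,
        SUM(CASE WHEN morning.hour IS NULL THEN calls ELSE calls / 12.0 END) AS calls
FROM    sample
LEFT JOIN
        (
        SELECT 0 AS hour UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
        SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
        SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
        SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11
        ) AS morning
ON      sample.hour = 0 AND sample.service IN ('old_service1')
GROUP BY 1
ORDER BY 1
"""


def spread_morning(rows):
    """Load (service, hour, calls) rows and return {hour: calls} after spreading."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (service TEXT, hour INTEGER, calls INTEGER)")
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)
    result = dict(conn.execute(QUERY).fetchall())
    conn.close()
    return result


# Hypothetical data: one old-style row (hours 00-11 pre-summed under hour 0)
# and three ordinary per-hour rows.
ROWS = [
    ("old_service1", 0, 1200),  # spread as 100 calls over each of hours 0..11
    ("WS04", 0, 10),            # new-style row at hour 0, left untouched
    ("WS04", 4, 24),
    ("WS12", 12, 526),
]

result = spread_morning(ROWS)
for hour, calls in result.items():
    print(hour, calls)
```

Rows whose service is not in the IN list never match the join, so morning.hour stays NULL and they keep their own hour and their full call count. Only the pre-summed rows are fanned out into twelve rows.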


Solution:2

You're probably best off doing this with temp tables or views (I'd recommend a view over a temp table); otherwise you will end up with a nasty, case-specific statement that will be a nightmare to maintain over time.
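
For what it's worth, a view-based version could look roughly like the sketch below. It again uses Python's sqlite3 module with an in-memory database rather than MySQL, and the names morning_hours and sample_normalized, the service name old_service1 and the sample rows are all hypothetical. The idea is that the special case lives in the view, so the reporting query stays the plain SUM() / GROUP BY from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (service TEXT, hour INTEGER, calls INTEGER);
INSERT INTO sample VALUES
    ('old_service1', 0, 1200),   -- old-style row: hours 00-11 pre-summed
    ('WS04', 4, 24),
    ('WS12', 12, 526);

-- Helper table with one row per morning hour (hypothetical name).
CREATE TABLE morning_hours (hour INTEGER);
INSERT INTO morning_hours VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11);

-- The view hides the special case: old-style rows are fanned out over
-- twelve hours, every other row passes through unchanged.
CREATE VIEW sample_normalized AS
SELECT  s.service,
        COALESCE(m.hour, s.hour) AS hour,
        CASE WHEN m.hour IS NULL THEN s.calls ELSE s.calls / 12.0 END AS calls
FROM    sample s
LEFT JOIN morning_hours m
ON      s.hour = 0 AND s.service IN ('old_service1');
""")

# The report itself is the simple query from the question, pointed at the view.
per_hour = dict(conn.execute(
    "SELECT hour, SUM(calls) FROM sample_normalized GROUP BY hour ORDER BY hour"
).fetchall())

for hour, calls in per_hour.items():
    print(hour, calls)
```

If the list of old services changes, only the view definition needs touching, which is the maintainability argument made above.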

