Tutorial :Re-using aggregate level formulas in SQL - any good tactics?



Question:

Imagine this case, but with a lot more component buckets and a lot more intermediates and outputs. Many of the intermediates are calculated at the detail level, but a few things are calculated at the aggregate level:

DECLARE @Profitability AS TABLE      (       Cust INT NOT NULL      ,Category VARCHAR(10) NOT NULL      ,Income DECIMAL(10, 2) NOT NULL      ,Expense DECIMAL(10, 2) NOT NULL      ,Liability DECIMAL(10, 2) NOT NULL      ,AllocatedCapital DECIMAL(10, 2) NOT NULL      ) ;    INSERT  INTO @Profitability  VALUES  ( 1, 'Software', 100, 50, 0, 0 ) ;   INSERT  INTO @Profitability  VALUES  ( 2, 'Software', 100, 20, 0, 0 ) ;   INSERT  INTO @Profitability  VALUES  ( 3, 'Software', 100, 60, 0, 0 ) ;   INSERT  INTO @Profitability  VALUES  ( 4, 'Software', 500, 400, 0, 0 ) ;   INSERT  INTO @Profitability  VALUES  (           5          ,'Hardware'          ,1000          ,550          ,0          ,0           ) ;   INSERT  INTO @Profitability  VALUES  (           6          ,'Hardware'          ,1000          ,250          ,500          ,200           ) ;   INSERT  INTO @Profitability  VALUES  (           7          ,'Hardware'          ,1000          ,700          ,500          ,600           ) ;   INSERT  INTO @Profitability  VALUES  (           8          ,'Hardware'          ,5000          ,4500          ,2500          ,800           ) ;     WITH    ProfitView            AS ( SELECT   Cust                         ,Category                         ,Income                         ,Expense                         ,Profit = Income - Expense                         ,NetProfit = Income - Expense                          - CASE WHEN Liability - AllocatedCapital > 0                                 THEN Liability - AllocatedCapital                                 ELSE 0                            END                 FROM     @Profitability               )      SELECT  Cust             ,Category             ,Income             ,Expense             ,Profit             ,NetProfit             ,Margin = Profit / Income             ,NetMargin = NetProfit / Income      FROM    ProfitView ; -- NOTE I've left off the AFTER grouping formulas on this one.    WITH    ProfitView            AS ( SELECT   Cust                         ,Category                         ,Income                         ,Expense                         ,Profit = Income - Expense                         ,NetProfit = Income - Expense                          - CASE WHEN Liability - AllocatedCapital > 0                                 THEN Liability - AllocatedCapital                                 ELSE 0                            END                 FROM     @Profitability               ),          GROUP1            AS ( SELECT   Category                         ,SUM(Profit) AS Profit                         ,SUM(NetProfit) AS NetProfit                         ,SUM(Income) AS Income                         ,SUM(Profit) / SUM(Income) AS Margin                         ,SUM(NetProfit) / SUM(Income) AS NetMargin                 FROM     ProfitView                 GROUP BY Category               ),          GROUP2            AS ( SELECT   GROUP1.*                         ,NetProfit - Profit AS Exposure                 FROM     GROUP1               )      SELECT  *             ,Exposure / Income AS ExposureRatio      FROM    GROUP2 ;    WITH    ProfitView            AS ( SELECT   Cust                         ,Category                         ,Income                         ,Expense                         ,Profit = Income - Expense                         ,NetProfit = Income - Expense                          - CASE WHEN Liability - AllocatedCapital > 0                                 THEN Liability - AllocatedCapital                                 ELSE 0                            END                 FROM     @Profitability               ),          GROUP1            AS ( SELECT   SUM(Profit) AS Profit                         ,SUM(NetProfit) AS NetProfit                         ,SUM(Income) AS Income                         ,SUM(Profit) / SUM(Income) AS Margin                         ,SUM(NetProfit) / SUM(Income) AS NetMargin                 FROM     ProfitView               ),          GROUP2            AS ( SELECT   GROUP1.*                         ,NetProfit - Profit AS Exposure                 FROM     GROUP1               )      SELECT  *             ,Exposure / Income AS ExposureRatio      FROM    GROUP2 ;  

Notice how the same formulae have to be used at the different aggregation levels. This results in code duplication.

I have thought of using UDFs (either scalar or table valued with an OUTER APPLY, since many of the final results may share intermediates which have to be calculated at the aggregate level), but in my experience the scalar and multi-statement table-valued UDFs perform very poorly.

Also thought about using more dynamic SQL and applying the formulas by name, basically.

Any other tricks, techniques or tactics to keeping these kinds of formulae which need to be applied at different levels in sync and/or organized?


Solution:1

Notice how the same formulae have to be used at the different aggregation levels. This results in code duplication.

If you functions were more complex, you could benefit from creating a custom CLR aggregate.

However, for such a simple function, a built-in SUM is the best.

Unlike PostgreSQL, SQL Server does not allow creating custom aggregates in a built-in language.


Solution:2

For your simplified example, I would refactor the calculations by returning the raw data (SUM(Income) and SUM(Expense)) separately in each resultset and calculate Profit and Margin in the business layer.

If that's not possible in the real case, could you make your simple example a little bit more complex so I can see what you're getting at?

I have worked on a project recently that needed complex business analysis calculations done inside queries. It turned out to be not possible to do them outside the data queries, so we eventually resorted to converting everything to dynamic SQL. This allowed us to construct macro functions to build the various parts of each query. By doing this we sacrificed readability but gained maintainability. We did not sacrifice testability because we wrote unit tests that exercised every possible code path through the macro functions, and logged each query as it was generated.


Solution:3

You could separate part of the complexity in a view:

create view dbo.vw_Profit  as  SELECT        Cust  ,   Income,  ,   Expense  ,   Income - Expense as Profit  FROM dbo.Profitability  

That allows for slightly simpler queries:

SELECT cust, SUM(profit), SUM(Income) / SUM(Expense)  FROM dbo.vw_Profit  GROUP BY cust  

The example query is hardly complex enough to warrant simplification with views. But views can be a great help with really complex queries.


Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »