Tutorial :SQL stored procedures design issue



Question:

I've lately seen a database where there was a table Types with columns Id, Key and Name.

Id was just an Id of the type, Key was a short key name for the type, for example "beer", and the Name was text that could be displayed for the user (for example, "Our greatest beers"). Id was of course unique and was a primary key for this table. Key was also unique. Other tables were always linked with table Types using its Id column, but stored procedures were always using Key for filtering (e.g. "X inner join Types on X.type_Id = Types.Id where Types.Key = 'beer' " instead of "X.type_Id = 3").

I thought about it as a bad approach. I would use Id rather than Key, even if i knew Key is unique. I think it would be possible (and ok) that the Key changes but the Id shouldn't change as it is used in another tables for linking. Are there any rules for not doing this? I mean if we changed Key "beer" to "beers", some of the stored procedures would stop working properly (and there actually was such a situation). For me it was quite intuitive that if the Id identifies the row in a table, we should always use the id, as the other attributes may change if needed and it should not cause problems. Am I right?


Solution:1

Key is a more meaningful and understandable way to access the data in the tables. Let me put it this way: would you rather debug this

SELECT ColumnA, ColumnB  FROM Table T  INNER JOIN Keys K  ON T.KeyId = K.KeyId  WHERE K.Key = 'Beer'  

Or

SELECT ColumnA, ColumnB  FROM Table T  WHERE T.KeyId = 103461  

When you have no idea what '103461' represents?

The same goes for stored procedures and other parameterized queries. Would you rather see

EXEC get_items_by_category 'Beer'  

Or

EXEC get_items_by_category 103461  

? The answer should really be obvious. Good, maintainable code is self documenting, and arbitrary IDs can't give you that.


Solution:2

There's only one (single field or multi field) primary key and it must be used always when doing JOINS. Anyway, in the specific domain (about which you are not telling us) it may make some sense to search for another field in specific queries. If, as you say, those fields can change, it's bad practice to hard code this queries.


Solution:3

Think of ID values as if they're memory addresses holding some variable's value. Would you prefer to refer the value by the variables' name or the memory address?

Personally I would never depend on a specific ID if it's auto increment. Especially if a row might be deleted. What if you dump the data one day and import it again, perhaps because you want to nake a fresh install. The SQL server will re-enumerate the ID and all your queries are broken.

EDIT: (answering your comment)

So your main argument is that in your scenario an ID will never change and a key can change? I would say that's bad design :). If you have to live with it, it seems obvious to use the ID, even if it's nondescriptive. Imho, the key has no value in this scenario if changing it is allowed and breaks lots of queries.


Solution:4

I agree with you in general, and disagree with darasd. I assume the id columns are surrogate keys - in other words, they are internal to the database only and never presented to the user. I think this (use of surrogate keys as the absolute unique identifiers) is the preferred approach in most scenarios. It allows you as you said to be able to better handle situations when the real-world descriptive key (eg "bees") is supposed to be unique but then in reality turns out not to be. The classic example of this is SSN when it is intended as a uniqueid but turns out in reality not to be.


Solution:5

Sounds like a bad design. I don't think there's fundamentally evil about it, after all, you can have multiple unique indexes for a table.

I can see where the id was an IDENTITY surrogate, you could not count on a certain value existing, so the natural key is used instead, but like you said, there's no guarantee that it exists, so your logic depending on it could break.

However, in that case, there are other designs which might work, like an IsBeer table which contained all the ids of "Beers" or a Flags table which had an IsBeer column (and an IsFood column), or something like that. Again, you cannot depend on them existing at all, but you would at least not have to worry about a column change breaking the logic, since you would have a FK relation.

To take welbog's example further, wouldn't you rather include the information from your application logic structurally in the database like:

SELECT ColumnA, ColumnB  FROM Table T  INNER JOIN Keys K      ON T.KeyId = K.KeyId      WHERE K.IsBeer = 1  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »