Tutorial :Performance - Int vs Char(3)



Question:

I have a table and am debating between 2 different ways to store information. It has a structure like so

int id

int FK_id

varchar(50) info1

varchar(50) info2

varchar(50) info3

int forTable or char(3) forTable

The FK_id can be a foreign key to one of 6 tables so I need another field to determine which table it's for.

I see two solutions:

  • An integer that is a FK to a settings table which has its actual value.
  • A char(3) field with the a abbreviated version of the table.

I am wondering if anyone knows if one will be more beneficial speed wise over the other or if there will be any major problems using the char(3)

Note: I will be creating an indexed view on each of the 6 different values for this field. This table will contain ~30k rows and will need to be joined with much larger tables


Solution:1

In this case, it probably doesn't matter except for the collation overhead (A vs a vs ä va à)

I'd use char(3), say for currency code like CHF, GBP etc But if my natural key was "Swiss Franc", "British Pound" etc, I'd take the numeric.

3 bytes + collation vs 4 bytes numeric? You'd need a zillion rows or be running a medium sized country before it mattered...


Solution:2

Have you considered using a TinyInt. Only takes one byte to store it's value. TinyInt has a range of values between 0 and 255.


Solution:3

Is the reason you need a single table that you want to ensure that when the six parent tables reference a given instance of a child row that is guaranteed to be the same instance? This is the classic "multi-parent" problem. An example of where you might run into this is with addresses or phone numbers with multiple person/contact tables.

I can think of a couple of options:

Choice 1: A link table for each parent table. This would be the Hoyle architecture. So, something like:

Create Table MyTable(                      id int not null Primary Key Clustered                      , info1 varchar(50) null                      , info2 varchar(50) null                      , info3 varchar(50) null                      )    Create Table LinkTable1(                          MyTableId int not null                          , ParentTable1Id int not null                          , Constraint PK_LinkTable1 Primary Key Clustered( MyTableId, ParentTable1Id )                          , Constraint FK_LinkTable1_ParentTable1                              Foreign Key ( MyTableId )                              References MyTable ( Id )                             , Constraint FK_LinkTable1_ParentTable1                              Foreign Key ( ParentTable1Id )                              References ParentTable1 ( Id )                            )  ...  Create Table LinkTable2...LinkTable3  

Choice 2. If you knew that you would never have more than say six tables and were willing to accept some denormalization and a fugly design, you could add six foreign keys to your main table. That avoids the problem of populating a bunch of link tables and ensures proper referential integrity. However, that design can quickly get out of hand if the number of parents grows.

If you are content with your existing design, then with respect to the field size, I would use the full table name. Frankly, the difference in performance between a char(3) and a varchar(50) or even varchar(128) will be negligible for the amount of data you are likely to put in the table. If you really thought you were going to have millions of rows, then I would strongly consider the option of linking tables.

If you wanted to stay with your design and wanted the maximum performance, then I would use a tinyint with a foreign key to a table that contained the list of the six tables with a tinyint primary key. That prevents the number from being "magic" and ensures that you narrow down the list of parent tables. Of course, it still does not prevent orphaned records. In this design, you have to use triggers to do that.


Solution:4

Because your FK cannot be enforced (since it is a variant depending upon type) by database constraint, I would strongly consider re-evaluating your design to use link tables, where each link table includes two FK columns, one to the PK of the entity and one to the PK of one of the 6 tables.

While this might seem to be overkill, it makes a lot of things simpler and adding new link tables is no more complex than accommodating new FK-types. In addition, it is more easily expandable to the case where an entity needs more than a 1-1 relationship to a single table, or needs multiple 1-1 relationships to the 6 other entities.

In a varying-FK scenario, you can lose database consistency, you can join to the wrong entity by neglecting to filter on type code, etc.

I should add that another huge benefit of link tables is that you can link to tables which have keys of varying data types (ints, natural keys, etc) without having to add surrograte keys or stored the key in a varchar or similar workarounds which are prone to problems.


Solution:5

I think a small integer (tinyint) is called for here. An "abbreviated version" looks too much like a magic number.

I also think performance wise the integer should beat the char(3).


Solution:6

First off, a 50 character Id that is not globally unique sounds a little scary. Do the IDs have some meaning? If not, you can easily get a GUID in less space. Personally, I am a big fan of making things human readable whenever possible. I would, and have, put the full name in graphs until I needed to do otherwise. My preference would be to have linking tables for each possible related table though.

Unless you are talking about really large scale, you are much better off decreasing the size of the IDs and taking a few more characters for the name of the table. For really large scale, I would decrease the size of the IDs and use an integer.

Jacob


Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »