Number Storage in SQL Server 13 Mar ’08

Posted by: Jason Laws

It’s always been important in SQL Server to specify the exact length and precision of number data types.  This is because SQL Server has always had just fixed length internal storage for numbers. 

For example, if you have a whole number to store but it can only been one or two digits long then declare it as SMALLINT. 

Recently I converted some tables from Oracle to SQL Server for benchmark testing.  Initially I left all the data type lengths the same as they were in Oracle (Oracle only has variable internal number storage). 

It seemed to take too long to load data into this table from a file, so I changed the data types of numeric fields to be just big enough for the data.  My load time halved! (and so did the space I was using for the table). 

Then I remembered that SQL Server 2005 introduced variable number storage.  The trick is you have to turn it on.  So I changed the table to have decimal data types for all numbers and turned on variable number storage by running the following:

- sp_db_vardecimal_storage_format ‘DatabaseName‘, ‘ON’

- sp_tableoption ‘TableName‘, ‘vardecimal storage format’, ‘ON’ 

I ran the load again and the table did take up a lot less space in the database, but the load time had increased.  I can only assume that the cost of calculating the size of the column somehow slowed things down…

Feedback

  1. Philippa 1:41 pm 13 Mar ’08

    I see that the latest version of WhereScape actually has a check box on the table properties storage tab that performs the setting of the vardecimal storage format at table create time.
    Regarding the performance of having that setting, my understanding is that the real overhead is on CPU, so it is useful to set on tables where you are IO bound and a very bad idea if you are already CPU bound. Saying that, obviously it is always going to be better to set to the specific ‘sized’ types like smallint .
    Another point that would be interesting is what impact the setting had on your read (rather than your test of write) performance, as I would imagine would most likely use this setting on a fact table rather than a load table

  2. Bruce W Cassidy 8:02 am 13 Mar ’08

    The decimal forms (including vardecimal) use BCD, which will store 2 digits per byte. BCD is also slower to perform calculations on, as it uses binary representation to approximate decimal notation, and things like overflow/underflow as a result of operations is slower.

    The integer variants (tinyint, smallint, etc) use native integer storage, and that’s as fast as you can get. So the same number stored in an integer form will always be faster than one stored in BCD (decimal) form.

    So it doesn’t surprise me at all that when you moved numbers from integer forms into any form of decimal (including vardecimal, which will have the additional overhead of having to calculate the amount of required storage) you saw a slow down.

    Picking the right size to store a field has always been a bit of an art. Vardecimal makes it a bit easier, but it still pays to think about what is best for what you are storing.

Contribute



E-mail addresses are not published and are kept confidential.




* - denotes required field.